Universal Higher Order Grammar 



Victor Gluzberg 
HOLTRAN Technology Ltd 
gluzberg@netvision.net.il

Abstract 

We examine the class of languages that can be defined entirely in terms
of provability in an extension of the sorted type theory ($\mathrm{Ty}_n$) by embedding
the logic of phonologies, without introducing special types for syntactic
entities. This class is proven to coincide precisely with the class of logically
closed languages, which may be thought of as functions from expressions to sets
of logically equivalent $\mathrm{Ty}_n$ terms. For a specific sub-class of logically closed
languages that are described by finite sets of rules or rule schemata, we find
effective procedures for building a compact $\mathrm{Ty}_n$ representation involving
a finite number of axioms or axiom schemata. The proposed formalism is
characterized by some useful features unavailable in a two-component
architecture of a language model. A further specialization and extension of the
formalism with a context type enables an effective account of intensional and
dynamic semantics.

1 Introduction 

Traditionally, a higher-order logic representation of natural language semantics
(Thomason, 1974) had to be combined with an additional formalism to describe
the syntactic structure of a language and a mapping between the two.
This two-component architecture of a language model has not essentially
changed with the invention and further rapid development of type-logical
grammar (Lambek, 1958; Blackburn et al., 1997), i.e. a parallel logical formalism
to describe a syntactic structure: in spite of the very close and deep
correspondence between the two logics, their internal languages and semantics
remain different.

Several theoretical, methodological and technological issues are rooted
in the two-component architecture of a language model. The following are
important to mention here.

1. A language in such a model cannot express anything about another
language of the same model nor, of course, about itself. This kind of
expressiveness is, however, one of the characteristic capabilities of a
natural language.

2. Lack of lexical robustness. As the semantic interpretation of a sentence
in the two-component language model can be composed only
from semantic interpretations of all its constituents, such a model fails
to interpret an entire sentence if it contains even a single unrecognized
(new) word. Some ad hoc add-ins to the formalism are the only known
solution to this issue.

3. Lack of structural (syntactic) robustness. Unlike the semantic logic,
which may be universally applicable to a wide variety of languages,
the syntactic logic often requires special extensions in order to cover
different languages and even specific structures in the same language.
Capturing the semantic categories and their relationships in the very
formalism (rather than in a concrete language model) makes it incapable
in principle of modeling language self-learning, that is, the derivation of
new grammar rules from text samples.

These and some other issues and needs motivated research into
generalizations based on a single logical system, applicable to both the semantics
and the syntactic structure of a language, such as (Kasper & Rounds,
1986), (King, 1989) and (Richter, 2004). Higher Order Grammar (HOG)
(Pollard & Hana, 2003; Pollard, 2004; Pollard, 2006) is probably the most
recent implementation of the idea and the first one based entirely on the
mainstream classical higher-order logic (HOL), traditionally applied only to
the semantics of natural languages. The specific HOL employed in HOG
results from the following main steps:

1. introduction of a phonological base type and a constant denoting the
operation of concatenation of phonologies;

2. introduction of abstract syntax entity base types and constructors of
derived types, comprising (together with these base types) a type kind
SIGN;

3. introduction of another type kind HYPER for semantic interpretations,
which consists of basic types of individuals and propositions and derived
types: functional types, product types and types of functions to the boolean type;

4. embedding all three logics (of phonologies, signs and hyperintensions)
into a single HOL, fixing a correspondence between SIGN
and HYPER types and adding two special families of constants denoting
functions from signs to phonologies and from signs to semantic
interpretations of corresponding types.

A concrete language is modeled in HOG by adding a set of non-logical 
axioms to postulate semantics and phonology of specific words and rules of 
their composition. 

Due to the introduction of the phonological type and constants for mapping
signs to phonologies, the HOG model is certainly better prepared for a formal
account of lexical robustness. Also, since directionality in HOG can be handled
by the phonological interpretation, it is sufficient for it to have a single
universal sign type constructor (implication), which partially addresses the issue
of structural robustness. However, as the syntactic and semantic type kinds
are fully separated, HOG still cannot address the issue of self-expressiveness.

Another important implication of the HOG architecture, as noted
in (Gluzberg, 2009), is that the semantic interpretation of any sentence or
constituent of it turns out to be represented not by a single HOL term, nor
even by a set of some arbitrarily selected terms (in case of ambiguity), but
by a whole class of logically equivalent terms, in a precise sense defined in
the referenced work. Referring to languages with this semantic property
as logically closed languages, one can say that all HOG-defined languages
are necessarily logically closed. It was also explained in (Gluzberg, 2009)
why the inverse question, i.e. whether any logically closed language can be
defined by a HOG with a given set of base SIGN types and type constructors,
cannot be answered positively. This result gives further evidence of the
limited robustness of the HOG model.

In the present work we examine what can be achieved by embedding into
a HOL only the logic of phonologies, without introducing special types for
syntactic entities, but using non-logical constants of regular functional
types to define syntax-to-phonology and syntax-to-semantics interfaces in a
concrete language model. As such a language representation is, like
HOG, entirely based on provability in a single HOL, all the languages it
can model are still logically closed. The main result of this work is
a proof of the inverse statement: any logically closed language can be
represented in the proposed HOL framework. This justifies referring
to it as Universal Higher-Order Grammar. Being fully free of embedded
syntactic restrictions and of limitations on types of semantic interpretations,
this formalism can efficiently address all the issues with the two-component
architecture of a language model mentioned above and therefore may have
several theoretical and practical implications.

The structure of the paper is as follows.

After the introduction in Section 2 of basic notations and some assumptions
on the axiomatization of a sorted type theory ($\mathrm{Ty}_n$), in Section 3 we
define the class of logically closed languages and introduce a few basic
operations that act within this class.

In Section 4 we define an extension $\mathrm{Ty}_n^{\mathcal{A}}$ of $\mathrm{Ty}_n$ by interpreting one of
its base types as symbolic, which is quite similar to the phonological type
of HOG, and introduce the notion of $\mathrm{Ty}_n$-representability of a language,
illustrated with a few preliminary examples.

Then in Section 5 we prove that the classes of $\mathrm{Ty}_n$-representable and logically
closed languages precisely coincide.

In the following sections we give explicit constructions of special $\mathrm{Ty}_n$
representations for some important sub-classes of logically closed languages
that are described by finite sets of rules or rule schemata.

In Section 6 we consider so-called lexicons, i.e. finitely generated logically
closed languages, define a special canonic $\mathrm{Ty}_n$ representation and prove that a
language has a canonic representation if and only if it is a lexicon.

In Section 7 a $\mathrm{Ty}_n$ representation is built for a recursive logically closed
grammar ($\mathcal{RG}$), defined as a tuple of logically closed languages linked with
each other by a set of relationships expressed by the basic operations
introduced in Section 3. As becomes clear from illustrating examples, the
language components of an $\mathcal{RG}$ stand for syntactic categories.

In Section 8 we consider a further specialization of $\mathrm{Ty}_n^{\mathcal{A}}$ by interpreting
another base type as a context, similar (though not identical) to the "state of
affairs" or "World" types of (Gallin, 1975) and (Pollard, 2005), respectively.
This specialization allows us to represent intensional languages as well and to
define instructive and context-dependent languages. The introduction of a few
new language construction operations then allows us to generalize the previous
results to context-dependent languages. We demonstrate some important
capabilities of this formalism with examples of how it addresses pronoun
anaphora resolution.

In Section 9 we introduce translation and expression operators that lead 
to a special language representation, revealing a useful property of partial 
translation. 



In Section 10 we conclude by briefly summarizing and discussing the most 
important implications of the obtained results and outline some open issues. 

2 Notations 

Following (Gallin, 1975), we denote by $\mathrm{Ty}_n$ a sorted type theory with a set
of primitive types consisting of the truth type $t$ and individual types $e_1$, $e_2$,
... $e_n$ of $n > 1$ different sorts. For the first individual type we will also use
a shorter alias $e =_{def} e_1$. For the sake of readability and compactness we
combine the full syntax of $\mathrm{Ty}_n$ from the syntaxes of $\mathrm{Ty}_2$ of (Gallin, 1975) and
$Q_0$ of (Andrews, 1986) and extend it as follows.

Derived types 

(i) If $\alpha$ and $\beta$ are types, then $(\alpha\beta)$ or just $\alpha\beta$ is a functional type,
interpreted as the type of functions from $\alpha$ to $\beta$. The parentheses are mandatory
only in complex types, in order to express association in an order other
than from right to left, for example: $et$, $tee$, $(et)e$ are equivalent to $(et)$,
$(t(ee))$, $((et)e)$. Note that, unlike (Andrews, 1986), we write "from"
and "to" types from left to right.

(ii) If $\alpha$ and $\beta$ are types, then $(\alpha \times \beta)$ or just $\alpha \times \beta$ is a product type,
interpreted as the type of pairs of elements from $\alpha$ and $\beta$, so that, for
example, $(\alpha \times \beta)\gamma$ is equivalent to $\alpha\beta\gamma$ and $\gamma(\alpha \times \beta)$ to $\gamma\alpha \times \gamma\beta$.
Repeated product constructors are also associated from right to left,
i.e. $(\alpha \times \beta \times \gamma)$ is equivalent to $(\alpha \times (\beta \times \gamma))$.
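As an informal illustration of the type notation above, the following Python sketch prints types with the right-to-left association convention and rewrites a function from a product type into its curried form. The class names `Base`, `Prod`, `Fn` and the helpers `show` and `curry` are hypothetical names introduced only for this sketch; they are not part of the formalism, and the "equivalence" of types is modelled here simply as a rewriting step.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Base:
    name: str                # a primitive type such as "t", "e" or "s"

@dataclass(frozen=True)
class Prod:
    left: "Type"             # first component of a product type
    right: "Type"            # second component of a product type

@dataclass(frozen=True)
class Fn:
    src: "Type"              # the "from" type, written on the left
    dst: "Type"              # the "to" type, written on the right

Type = Union[Base, Prod, Fn]

def show(ty: Type) -> str:
    """Print a type, parenthesising a functional "from" part only."""
    if isinstance(ty, Base):
        return ty.name
    if isinstance(ty, Prod):
        return f"({show(ty.left)} x {show(ty.right)})"
    left = show(ty.src)
    if isinstance(ty.src, Fn):
        left = f"({left})"
    return left + show(ty.dst)

def curry(ty: Type) -> Type:
    """Rewrite (a x b)c into abc, as in item (ii)."""
    if isinstance(ty, Fn) and isinstance(ty.src, Prod):
        return Fn(ty.src.left, curry(Fn(ty.src.right, ty.dst)))
    return ty

e, t = Base("e"), Base("t")
print(show(Fn(e, t)))                    # et
print(show(Fn(t, Fn(e, e))))             # tee
print(show(Fn(Fn(e, t), e)))             # (et)e
print(show(curry(Fn(Prod(e, e), t))))    # eet
```

Only the "from" part of a functional type ever needs parentheses in this printer, which is exactly the right-to-left association rule of item (i).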

Variables 

$x$, $y$, $z$ subscripted by types and optionally superscripted by integer indices
stand for variables of corresponding types, for example: $y_{et}$, $x^1_e$, $x^2_s$.

Constants 

Non-logical constants are written as capital C in bold, subscripted by a type
and superscripted by an index, like $\mathbf{C}^i_\alpha$. We do not assume $\mathrm{Ty}_n$ to necessarily
have a constant of a given type $\alpha$ for any index $i$. Rather, we assume
the constants to be indexed in a way that admits expanding $\mathrm{Ty}_n$ by
arbitrarily many new constants of any type. We also use arbitrary letters or
words in bold (like Assert or Empty) as mnemonic aliases for some constants
that are assumed to have special semantics and/or satisfy some non-logical
axioms in an extension of $\mathrm{Ty}_n$. Similar notations may also be introduced
for some logical constants, i.e. pure bound terms, for example the identity
$\mathbf{I}_{\alpha\alpha} =_{def} \lambda x_\alpha\, x_\alpha$.

Terms 

(i) Variables and constants comprise elementary terms.

(ii) If $A_{\alpha\beta}$ and $B_\alpha$ are terms of the corresponding types, then the application
$A_{\alpha\beta} B_\alpha$ is a term of type $\beta$ denoting (in an interpretation of $\mathrm{Ty}_n$) the
value of the function denoted by $A_{\alpha\beta}$ at the argument denoted by $B_\alpha$.

(iii) If $A_\beta$ is a term of type $\beta$ then the lambda abstraction $\lambda x_\alpha A_\beta$ is a term
of type $\alpha\beta$ denoting a function whose value for any argument is the
denotation of $A_\beta$.

(iv) If $A_\alpha$ and $B_\alpha$ are terms of type $\alpha$, then $A_\alpha = B_\alpha$ is a term of type $t$,
denoting the identity relation between elements of type $\alpha$.

(v) If $A_\alpha$ and $B_\beta$ are terms of the corresponding types, then the pair
$(A_\alpha, B_\beta)$ is a term of type $\alpha \times \beta$ denoting the ordered pair of
denotations of $A_\alpha$ and $B_\beta$; repeated operators $(\,,)$ are associated from
right to left, so that a tuple $(A^1_{\alpha_1}, A^2_{\alpha_2}, \ldots A^k_{\alpha_k})$ denotes the same as
$(A^1_{\alpha_1}, (A^2_{\alpha_2}, (\ldots A^k_{\alpha_k})))$.

(vi) Finally, for a term $A_{\alpha \times \beta}$ of a type $\alpha \times \beta$, the projections $\pi_1 A_{\alpha \times \beta}$ and $\pi_2 A_{\alpha \times \beta}$
are terms of types $\alpha$ and $\beta$ respectively, denoting the elements of the
pair denoted by $A_{\alpha \times \beta}$.

We assume the axiomatization of $\mathrm{Ty}_n$ to be a straightforward generalization
to the case of $n > 2$ individual types of either the theory denoted as $\mathrm{Ty}_n + D$
in (Gallin, 1975) or the theory $Q_0$ of (Andrews, 1986). With either choice,
for any type $\alpha$ the description operator $\iota_{(\alpha t)\alpha}$ and hence the "if-then-else"
constant $\mathbf{C}_{\alpha\alpha t\alpha}$ are available with the fundamental properties

$$\vdash \iota_{(\alpha t)\alpha}\,(\lambda x_\alpha (y_\alpha = x_\alpha)) = y_\alpha,$$
$$\vdash \mathbf{C}_{\alpha\alpha t\alpha}\, x_\alpha\, y_\alpha\, T = x_\alpha, \qquad \vdash \mathbf{C}_{\alpha\alpha t\alpha}\, x_\alpha\, y_\alpha\, F = y_\alpha.$$

All the usual Boolean connectives, including the terms $F$ and $T$ denoting
the false and true values, and quantifiers can be defined in $\mathrm{Ty}_n$ in exactly the
same way as in $\mathrm{Ty}_2$ and $Q_0$. In addition, we will also employ the following
notational shortcuts:





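$$\begin{array}{lcl}
A_\alpha \neq B_\alpha & =_{def} & \sim (A_\alpha = B_\alpha), \\
\sim A_{\alpha t} & =_{def} & \lambda x_\alpha \sim (A_{\alpha t}\, x_\alpha), \\
A_{\alpha t} \vee B_{\alpha t} & =_{def} & \lambda x_\alpha (A_{\alpha t}\, x_\alpha \vee B_{\alpha t}\, x_\alpha), \\
A_{\alpha t} \wedge B_{\alpha t} & =_{def} & \lambda x_\alpha (A_{\alpha t}\, x_\alpha \wedge B_{\alpha t}\, x_\alpha), \\
A_{\alpha t} \,\&\, B_{\beta t} & =_{def} & \lambda x_\alpha \lambda x_\beta (A_{\alpha t}\, x_\alpha \wedge B_{\beta t}\, x_\beta), \\
A_{\alpha t} \subseteq B_{\alpha t} & =_{def} & \forall x_\alpha (A_{\alpha t}\, x_\alpha \rightarrow B_{\alpha t}\, x_\alpha), \\
A_\alpha | B_\alpha & =_{def} & \mathbf{C}_{\alpha\alpha t\alpha}\, B_\alpha\, A_\alpha,
\end{array}$$

and, for an arbitrary binary operator $O$:

$$\begin{array}{lcl}
(O\, B_\beta) & =_{def} & \lambda x_\alpha (x_\alpha\, O\, B_\beta), \\
(A_\alpha\, O) & =_{def} & \lambda x_\beta (A_\alpha\, O\, x_\beta), \\
(O) & =_{def} & \lambda x_\alpha \lambda x_\beta (x_\alpha\, O\, x_\beta)
\end{array}$$

(an operator sign in such shortcuts might be subscripted by a type, for
example $(=_{eet})$ to denote specifically $\lambda x_e \lambda y_e (x_e = y_e)$).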
In the meta-language: 

(i) the $\vdash$ sign denotes provability in $\mathrm{Ty}_n$ or an extension of it, specified by
the context;

(ii) the notation $A_\alpha(B_\beta)$ stands for the result of substituting a term $B_\beta$ for all
free occurrences of a variable $x_\beta$ in a term $A_\alpha$, where $B_\beta$ is free for $x_\beta$ in $A_\alpha$;

(iii) if $M$ is a model of $\mathrm{Ty}_n$ and $a$ is an assignment of variables in this model,
then $[\![\cdot]\!]_M$ denotes the interpretation of a constant in $M$ and $[\![\cdot]\!]_{M,a}$ denotes
the value of an arbitrary term assigned to it by $a$ in $M$;

(iv) $\Rightarrow$ and $\Leftrightarrow$ express implication and logical equivalence.

3 Logically closed languages 

The two fundamental formalizations of the notion of language with which we
deal in this work, $\alpha$-language and logically closed $\alpha$-language, were
introduced and informally discussed in (Gluzberg, 2009). We reproduce these
formal definitions here in the current notation for more convenient reference.

Definition 3.1. Let $\mathcal{A}$ be a finite alphabet $\{a_1, a_2, \ldots a_N\}$, let $\mathcal{A}^*$ denote the
set of all words over this alphabet and let $\mathcal{T}_\alpha$ denote the set of all $\alpha$-terms of
$\mathrm{Ty}_n$. An $\alpha$-language is a relation $\mathcal{L} \subseteq \mathcal{A}^* \otimes \mathcal{T}_\alpha$.

Referring to words over the alphabet $\mathcal{A}$ as "expressions" and $\alpha$-terms as
"$\alpha$-meanings," one can say that an $\alpha$-language is a set of pairs of expressions
and their $\alpha$-meanings. A set of words $\mathcal{L} \subseteq \mathcal{A}^*$ can then be considered as a
language for the unit type meaning. In the general case, the projection of $\mathcal{L}$ to
$\mathcal{A}^*$ is the set of all valid expressions of $\mathcal{L}$, which we refer to as the domain
of the language, and the projection of $\mathcal{L}$ to $\mathcal{T}_\alpha$ is the set of all meanings $\mathcal{L}$
can express, which we refer to as the range of the language.

A trivial particular case of an $\alpha$-language is a singleton $\{(w, A_\alpha)\}$, where
$w \in \mathcal{A}^*$.

If $\mathcal{L}$ and $\mathcal{K}$ are $\alpha$-languages, then their union $\mathcal{L} \cup \mathcal{K}$ is a new $\alpha$-language
that contains all expressions and corresponding meanings of both $\mathcal{L}$ and $\mathcal{K}$.

The following definitions introduce some further operations to build complex
languages from simpler ones.

Definition 3.2. If $\mathcal{L}$ is an $\alpha$-language and $\mathcal{K}$ is a $\beta$-language, then their
language concatenation is the $\alpha \times \beta$-language

$$\mathcal{L} * \mathcal{K} =_{def} \{(uv, (A_\alpha, B_\beta)) \mid (u, A_\alpha) \in \mathcal{L} \wedge (v, B_\beta) \in \mathcal{K}\}$$

where $uv$ denotes concatenation of words $u$ and $v$.

Definition 3.3. For a given $\alpha$-language $\mathcal{L}$ and a relation $\mathcal{R} \subseteq \mathcal{T}_\alpha \otimes \mathcal{T}_\beta$, the
semantic rule application is the $\beta$-language

$$\mathcal{L} \triangleright \mathcal{R} =_{def} \{(w, B_\beta) \mid (w, A_\alpha) \in \mathcal{L} \wedge (A_\alpha, B_\beta) \in \mathcal{R}\}.$$

Thus, application of a semantic rule cannot extend the domain of a language,
but only maps the set of meanings of each of its expressions to another
set (generally of a different type); if the new set turns out to be empty, then
the corresponding expression is "filtered out" from the domain of the resulting
language. An important example of a semantic rule application is "folding"
product-type meanings of a concatenation of two languages to meanings of a
scalar type. Note that a folding rule which also filters out some expressions
from the language concatenation can in fact check an agreement between the
concatenated constituents (at the semantic level).
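The two operations of Definitions 3.2 and 3.3 can be sketched on finite toy languages as follows. This is only an illustration under simplifying assumptions: meanings are plain Python values standing in for terms, and the lexical data (`nouns`, `verbs`), the number feature and the folding rule `fold` are hypothetical examples that do not come from the paper.

```python
from itertools import product

# A language is modelled as a set of (expression, meaning) pairs.

def concat(lang_l, lang_k):
    """Language concatenation: join expressions, pair up meanings."""
    return {(u + v, (a, b)) for (u, a), (v, b) in product(lang_l, lang_k)}

def apply_rule(lang, rule):
    """Semantic rule application; `rule` is a set of (meaning, meaning) pairs."""
    return {(w, b) for (w, a) in lang for (a2, b) in rule if a == a2}

# Hypothetical toy data: each meaning carries a grammatical number.
nouns = {("dog ", ("DOG", "sg")), ("dogs ", ("DOG", "pl"))}
verbs = {("barks", ("BARK", "sg")), ("bark", ("BARK", "pl"))}

pairs = concat(nouns, verbs)

# A folding rule, defined only where the two constituents agree in number.
fold = {
    (((n, num_n), (v, num_v)), f"{v}({n})")
    for (_, (n, num_n)) in nouns
    for (_, (v, num_v)) in verbs
    if num_n == num_v
}

sentences = apply_rule(pairs, fold)
print(sorted(w for w, _ in sentences))
```

The concatenation contains all four noun-verb combinations, while the folded language keeps only the expressions whose constituents agree, which is the filtering effect described above.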

The above definition of an $\alpha$-language seems to provide the most general
formalization of the notion of a language whose distinct meanings are
understood as syntactically different (although possibly logically equivalent)
$\mathrm{Ty}_n$ terms of a certain type. For example, the representation of $\mathrm{Ty}_n$ formulas
in the $\mathrm{Ty}_n$ language is itself a $t$-language. In the application to natural
languages, however, a more restricted formalization might be more suitable:

Definition 3.4. A logically closed $\alpha$-language is an $\alpha$-language $\mathcal{L}$ such that
whenever $(w, B_\alpha) \in \mathcal{L}$, $(w, C_\alpha) \in \mathcal{L}$ and $\vdash A_\alpha = B_\alpha \vee A_\alpha = C_\alpha$ then
$(w, A_\alpha) \in \mathcal{L}$ also. A minimal logically closed $\alpha$-language $\overline{\mathcal{L}}$ which includes a
given arbitrary $\alpha$-language $\mathcal{L}$ is said to be its logical closure.

This definition actually captures the two important features of a logically 
closed a-language: 

1. If an expression $w$ in the language has a meaning $B_\alpha$, then it also has
every meaning $A_\alpha$ logically equivalent to $B_\alpha$.

2. If an expression $w$ is ambiguous, i.e. has at least two distinct meanings
$B_\alpha$ and $C_\alpha$ that are not logically equivalent, then it also has every meaning
$A_\alpha$ for which it is provable that $A_\alpha$ equals either $B_\alpha$ or $C_\alpha$.

Therefore, every valid expression of a logically closed language is associ- 
ated not with a single, nor even with a set of arbitrarily selected terms (in 
case of ambiguity), but with a whole class of logically equivalent terms. A 
precise formulation of this interpretation follows. 

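Definition 3.5. A set $\mathcal{M} \subseteq \mathcal{T}_\alpha$ is said to be logically closed iff whenever
$B_\alpha \in \mathcal{M}$, $C_\alpha \in \mathcal{M}$ and

$$\vdash A_\alpha = B_\alpha \vee A_\alpha = C_\alpha \qquad (3.1)$$

then $A_\alpha \in \mathcal{M}$ also. A minimal logically closed set $\overline{\mathcal{M}} \subseteq \mathcal{T}_\alpha$ which includes
an arbitrary set $\mathcal{M} \subseteq \mathcal{T}_\alpha$ is said to be its logical closure. If, in addition,
$\overline{\mathcal{M}} = \overline{\mathcal{N}}$ for a set $\mathcal{N} \subseteq \mathcal{T}_\alpha$, we say that the two sets $\mathcal{M}$ and $\mathcal{N}$ are logically equivalent
and denote this relation as $\mathcal{M} \sim \mathcal{N}$.

It is readily seen that $\sim$ is an equivalence relation in the power set $\mathcal{P}(\mathcal{T}_\alpha)$
and therefore a logically closed $\alpha$-language might be defined equivalently as
a function $\mathcal{L} : \mathcal{A}^* \to \mathcal{P}(\mathcal{T}_\alpha)/\!\sim$.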
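The functional view of a logically closed language can be illustrated with a deliberately simplified toy. In the sketch below, provable equality of terms is replaced by equality of values in one fixed arithmetic model, which is an assumption made only for illustration (the condition (3.1) itself refers to provability of a disjunction in the type theory). Terms are integers or nested tuples, and the helper names `value`, `in_closure` and `as_quotient` are hypothetical.

```python
# Toy terms: an integer, or a tuple ("+", left, right) / ("*", left, right).

def value(term):
    """Evaluate a toy term in the fixed arithmetic model."""
    if isinstance(term, int):
        return term
    op, left, right = term
    l, r = value(left), value(right)
    return l + r if op == "+" else l * r

def in_closure(term, generators):
    """Toy closure test: the term equals some generator in value."""
    return value(term) in {value(g) for g in generators}

def as_quotient(language):
    """View a language as a map from expressions to sets of classes."""
    table = {}
    for word, term in language:
        table.setdefault(word, set()).add(value(term))
    return table

# Hypothetical toy language: "four" has two syntactically distinct meanings.
language = {("four", ("+", 2, 2)), ("four", 4), ("six", ("*", 2, 3))}
classes = as_quotient(language)
print(classes)
```

In this toy, two syntactically different meanings of the same expression collapse into one class, and any other term with the same value is treated as belonging to the closure.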

The simplest non-empty logically closed $\alpha$-language is a logical singleton $\overline{\{(w, A_\alpha)\}}$.



As shown in (Gluzberg, 2009), any language defined by a
Higher Order Grammar (HOG) (Pollard & Hana, 2003; Pollard, 2004;
Pollard, 2006) is necessarily logically closed.

Note that the class of logically closed languages is not closed under the 
union operation nor under operations of language concatenation and semantic 
rule applications defined above. Similar operations that act within this class 
can be defined as follows. 

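Definition 3.6. If $\mathcal{L}$ and $\mathcal{K}$ are $\alpha$-languages, then their logical join is the
$\alpha$-language $\mathcal{L} \mathbin{\bar\cup} \mathcal{K} =_{def} \overline{\mathcal{L} \cup \mathcal{K}}$.

Definition 3.7. If $\mathcal{L}$ is an $\alpha$-language and $\mathcal{K}$ is a $\beta$-language, then their
logical concatenation is the $\alpha \times \beta$-language $\mathcal{L} \mathbin{\bar{*}} \mathcal{K} =_{def} \overline{\mathcal{L} * \mathcal{K}}$.

Definition 3.8. A semantic rule $\mathcal{R} \subseteq \mathcal{T}_\alpha \otimes \mathcal{T}_\beta$ is said to be logically closed
if the full image of any logically closed set $\mathcal{M} \subseteq \mathcal{T}_\alpha$ under the relation $\mathcal{R}$ is
logically closed.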

Thus, a logical join and logical concatenation of arbitrary languages, as 
well as application of a logically closed semantic rule to any language of the 
matching type, are all logically closed languages. 

The following associativity and monotonicity properties follow directly 
from the above definitions. 

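Lemma 3.1. For any $\alpha$-languages $\mathcal{K}$, $\mathcal{L}$, a $\beta$-language $\mathcal{M}$ and a rule $\mathcal{R}$:

$$\begin{array}{l}
(\mathcal{K} \mathbin{\bar\cup} \mathcal{L}) \mathbin{\bar{*}} \mathcal{M} = \mathcal{K} \mathbin{\bar{*}} \mathcal{M} \mathbin{\bar\cup} \mathcal{L} \mathbin{\bar{*}} \mathcal{M}, \\
\mathcal{M} \mathbin{\bar{*}} (\mathcal{K} \mathbin{\bar\cup} \mathcal{L}) = \mathcal{M} \mathbin{\bar{*}} \mathcal{K} \mathbin{\bar\cup} \mathcal{M} \mathbin{\bar{*}} \mathcal{L}, \\
(\mathcal{K} \mathbin{\bar\cup} \mathcal{L}) \triangleright \mathcal{R} = \mathcal{K} \triangleright \mathcal{R} \mathbin{\bar\cup} \mathcal{L} \triangleright \mathcal{R}.
\end{array} \qquad (3.2)$$

Definition 3.9. A logically closed set $\overline{\mathcal{M}} \subseteq \mathcal{T}_\alpha$ is said to be finitely generated
if it is the logical closure of a finite set $\mathcal{M} \subseteq \mathcal{T}_\alpha$.

Definition 3.10. A logically closed semantic rule $\mathcal{R} \subseteq \mathcal{T}_\alpha \otimes \mathcal{T}_\beta$ is said to
be finitely ambiguous if for any term $A_\alpha$, the full image of $\{A_\alpha\}$ under the
relation $\mathcal{R}$ is finitely generated.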

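An important class of finitely ambiguous semantic rules is given by the
following

Definition 3.11. A logically closed semantic rule $\mathcal{R} \subseteq \mathcal{T}_\alpha \otimes \mathcal{T}_\beta$ is said to
have a canonic representation in $\mathcal{D}_\alpha \subseteq \mathcal{T}_\alpha$ if there exists a $\mathrm{Ty}_n^{\mathcal{A}}$ term $R_{\alpha\beta t}$
such that

$$(A_\alpha, B_\beta) \in \mathcal{R} \iff\ \vdash R_{\alpha\beta t}\, A_\alpha\, B_\beta$$

and for any $A_\alpha \in \mathcal{D}_\alpha$

$$\vdash R_{\alpha\beta t}\, A_\alpha = \bigvee_{i=1}^{k} (= B^i_\beta),$$

where the terms $B^1_\beta, \ldots B^k_\beta$ as well as the number $k$ may depend on $A_\alpha$ and
in the case $k = 0$ the disjunction is reduced to $\lambda x_\beta F$. We further qualify a
rule with a canonic representation in $\mathcal{D}_\alpha \subseteq \mathcal{T}_\alpha$ as non-degenerate, degenerate,
unambiguous or ambiguous in $\mathcal{D}_\alpha$, if $k > 0$ for all $A_\alpha \in \mathcal{D}_\alpha$, $k = 0$ for some
$A_\alpha \in \mathcal{D}_\alpha$, $k = 1$ for all $A_\alpha \in \mathcal{D}_\alpha$ or $k > 1$ for some $A_\alpha \in \mathcal{D}_\alpha$, respectively.

An important example of a semantic rule with a canonic representation
in any domain is given by a particular case where

$$R_{\alpha\beta t} =_{def} \lambda x_\alpha (= F_{\alpha\beta}\, x_\alpha)$$

which is referred to below as a functional semantic rule. Note that a
functional rule is both non-degenerate and unambiguous in any domain.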

4 Symbolic type and $\mathrm{Ty}_n$-representable languages

Consider an extension $\mathrm{Ty}_n^{\mathcal{A}}$ of $\mathrm{Ty}_n$ with the following set of non-logical axioms,
where $s$ denotes the primitive type $e_n$ and $A_s + B_s =_{def} \mathbf{C}^0_{sss}\, A_s\, B_s$:

$$\forall x_s ((x_s + \mathbf{C}^0_s = x_s) \wedge (\mathbf{C}^0_s + x_s = x_s)), \qquad (4.1)$$

$$\forall x_s \forall y_s \forall z_s ((x_s + y_s) + z_s = x_s + (y_s + z_s)), \qquad (4.2)$$

$$\forall x_s \forall y_s \forall z_s (x_s + \mathbf{C}^i_s + y_s \neq x_s + \mathbf{C}^j_s + z_s), \text{ for all } 1 \le i < j \le N. \qquad (4.3)$$

It is easy to see that these axioms are valid in a model $M$ with the $s$-domain
$\mathcal{D}_s =_{def} \mathcal{A}^*$ and interpretation $[\![\cdot]\!]_M$ such that $[\![\mathbf{C}^0_s]\!]_M$ is the empty word
$\epsilon \in \mathcal{A}^*$, $[\![\mathbf{C}^i_s]\!]_M$ is $a_i \in \mathcal{A}$ for $1 \le i \le N$, all the rest of the $s$-constants are
interpreted by compound words from $\mathcal{A}^*$ and, finally, the operation $(+)$ is
interpreted as the operation of word concatenation. This observation proves
the consistency of $\mathrm{Ty}_n^{\mathcal{A}}$ and justifies referring to the type $s$ as the symbolic
type. Note that formally, up to the additional axiom 4.3,
the symbolic type is quite similar to the phonological type of HOG (Pollard,
2006). We use a different term here in order to reflect the additional
axiomatic restriction, which expresses the distinctness of the alphabet symbols and is
significant in any context, and also keeping in mind that generally they may stand
for symbols of an arbitrary nature, for example graphic as well as phonetic,
in which case $\mathrm{Ty}_n^{\mathcal{A}}$ might be further extended to accommodate more symbol
aggregation operations in addition to linear concatenation.
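The validity of axioms 4.1 - 4.3 in the word model can be spot-checked by brute force. The sketch below is only a sanity check over short words, not a proof; it assumes a hypothetical two-letter alphabet and uses Python strings with `+` as the intended interpretation of the symbolic type and of concatenation.

```python
from itertools import product

ALPHABET = "ab"   # hypothetical alphabet a_1, a_2 (N = 2)

def words(max_len):
    """All words over ALPHABET of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)

def check_axioms(max_len=3):
    ws = list(words(max_len))
    # (4.1): the empty word is a two-sided unit of concatenation.
    unit = all(x + "" == x and "" + x == x for x in ws)
    # (4.2): concatenation is associative.
    assoc = all((x + y) + z == x + (y + z) for x in ws for y in ws for z in ws)
    # (4.3): distinct letters after a common prefix give distinct words.
    distinct = all(
        x + ALPHABET[i] + y != x + ALPHABET[j] + z
        for i in range(len(ALPHABET))
        for j in range(i + 1, len(ALPHABET))
        for x in ws for y in ws for z in ws
    )
    return unit, assoc, distinct

print(check_axioms())
```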

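$\mathrm{Ty}_n^{\mathcal{A}}$ allows some $\alpha$-languages to be naturally defined by its terms or
possibly by the terms of its extension $\mathrm{Ty}_n^{\mathcal{A}+}$ with a set of additional
non-logical constants. Indeed, define the mapping $T_{\mathcal{A}} : \mathcal{A}^* \to \mathcal{T}_s$ as follows

$$T_{\mathcal{A}}(\epsilon) = \mathbf{C}^0_s, \quad T_{\mathcal{A}}(a_i) = \mathbf{C}^i_s, \quad T_{\mathcal{A}}(a_i w) = \mathbf{C}^i_s + T_{\mathcal{A}}(w), \qquad (4.4)$$

and let $\Delta$ be a consistent set of $\mathrm{Ty}_n^{\mathcal{A}+}$ formulas. Then the condition

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha,$$

for any $\mathrm{Ty}_n^{\mathcal{A}+}$ term $L_{s\alpha t}$ defines an $\alpha$-language ($\vdash$ here and everywhere below
denotes provability in $\mathrm{Ty}_n^{\mathcal{A}+}$).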

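Definition 4.1. Let $\mathrm{Ty}_n^{\mathcal{A}+}$ be an extension of $\mathrm{Ty}_n^{\mathcal{A}}$, $\Delta$ a consistent set of $\mathrm{Ty}_n^{\mathcal{A}+}$
formulas and $L_{s\alpha t}$ a $\mathrm{Ty}_n^{\mathcal{A}+}$ term. An $\alpha$-language $\mathcal{L}$ is said to be represented
by $(\Delta, L_{s\alpha t})$ if for any $\mathrm{Ty}_n^{\mathcal{A}}$ term $A_\alpha$ and $w \in \mathcal{A}^*$

$$(w, A_\alpha) \in \mathcal{L} \iff \Delta \vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha.$$

Note that in the case of a finite set $\Delta$, by the Deduction Theorem, a
language represented by $(\Delta, L_{s\alpha t})$ is also represented by $(\emptyset, \lambda x_s \lambda x_\alpha (D_t \rightarrow
L_{s\alpha t}\, x_s\, x_\alpha))$ where $D_t$ is the conjunction of all formulas of $\Delta$ and the variables
$x_s$, $x_\alpha$ are not free in $D_t$. In this case we say that the language has a compact
$\mathrm{Ty}_n$ representation or, alternatively, is represented by a term.

Definition 4.2. An $\alpha$-language $\mathcal{L}$ is said to be representable in $\mathrm{Ty}_n^{\mathcal{A}}$, or
$\mathrm{Ty}_n$-representable for short, if there exists an extension $\mathrm{Ty}_n^{\mathcal{A}+}$ of $\mathrm{Ty}_n^{\mathcal{A}}$, a consistent
set $\Delta$ of $\mathrm{Ty}_n^{\mathcal{A}+}$ formulas and a $\mathrm{Ty}_n^{\mathcal{A}+}$ term $L_{s\alpha t}$ such that $\mathcal{L}$ is represented by
$(\Delta, L_{s\alpha t})$.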

Here are a few simple examples of Ty^-representable languages: 

• the term $(= T_{\mathcal{A}}(w))\,\&\,(= A_\alpha)$ represents a logical singleton (3.2);

• the term $\lambda x_s (= \mathbf{C}^0_{s\alpha}\, x_s)$ represents an $\alpha$-language where every word $w$
is a valid expression for the application $\mathbf{C}^0_{s\alpha}\, T_{\mathcal{A}}(w)$;

• the term $(=_{sst})$ represents an $s$-language that realizes the mapping
(4.4), i.e. associates with every word $w$ a single "meaning" $T_{\mathcal{A}}(w)$.

In the case of an alphabet $\mathcal{A}$ consisting of symbols that can be typed in this
paper (including a space), a more convenient notation for the mapping $T_{\mathcal{A}}$ can be
introduced: if $w$ is an arbitrary string of such symbols, let /w/ $=_{def} T_{\mathcal{A}}(w)$.
Then, by definition 4.1, for arbitrary strings $u$, $v$: /u/ + /v/ = /uv/; for
example: /c/ + /a/ + /r/ = /car/. We will make use of this practical notation
in some of the subsequent sections.
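The mapping (4.4) and the /w/ notation can be made concrete with a small sketch. The term classes `Const` and `Plus`, the functions `T` and `value`, and the three-letter alphabet are hypothetical names and data chosen for illustration; `value` evaluates an s-term in the word model in which + is interpreted as string concatenation.

```python
from dataclasses import dataclass

ALPHABET = "acr"          # hypothetical alphabet a_1 = a, a_2 = c, a_3 = r

@dataclass(frozen=True)
class Const:
    index: int            # index 0 is the empty word, index i the letter a_i

@dataclass(frozen=True)
class Plus:
    left: object          # left operand of +
    right: object         # right operand of +

def T(word):
    """The mapping (4.4) from words to s-terms."""
    if word == "":
        return Const(0)
    head = Const(ALPHABET.index(word[0]) + 1)
    if len(word) == 1:
        return head
    return Plus(head, T(word[1:]))

def value(term):
    """Value of an s-term in the word model."""
    if isinstance(term, Const):
        return "" if term.index == 0 else ALPHABET[term.index - 1]
    return value(term.left) + value(term.right)

car = T("car")                       # the term written /car/
split = Plus(T("ca"), T("r"))        # the term written /ca/ + /r/
print(value(car), value(split))
```

Note that /ca/ + /r/ and /car/ are different terms syntactically; they coincide only in value, which corresponds to the equality being derived from the axioms rather than holding by syntactic identity.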

5 $\mathrm{Ty}_n$ representation existence theorem

We now show that an $\alpha$-language is $\mathrm{Ty}_n$-representable if and only if it is
logically closed.

Proposition 5.1. If an $\alpha$-language $\mathcal{L}$ is $\mathrm{Ty}_n$-representable, then it is logically
closed.

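Proof. Let $(\Delta, L_{s\alpha t})$ be a representation of $\mathcal{L}$, $w \in \mathcal{A}^*$,

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, B_\alpha, \quad \Delta \vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, C_\alpha \quad \text{and} \quad \vdash A_\alpha = B_\alpha \vee A_\alpha = C_\alpha.$$

From the last derivation, by the metatheorems

$$\vdash A_t \vee B_t = (A_t | B_t)\, B_t, \qquad (5.1)$$

$$\vdash (P_{\alpha\beta}\, A_\alpha | P_{\alpha\beta}\, B_\alpha)\, C_t = P_{\alpha\beta}((A_\alpha | B_\alpha)\, C_t), \qquad (5.2)$$

it follows that

$$\vdash A_\alpha = (B_\alpha | C_\alpha)(A_\alpha = C_\alpha)$$

and

$$\vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha = (L_{s\alpha t}\, T_{\mathcal{A}}(w)\, B_\alpha \,|\, L_{s\alpha t}\, T_{\mathcal{A}}(w)\, C_\alpha)(A_\alpha = C_\alpha).$$

Thus the first two derivations imply $\Delta \vdash L_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha$, that is, $(w, A_\alpha) \in \mathcal{L}$
also. □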

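Lemma 5.2. In any $\mathrm{Ty}_n^{\mathcal{A}}$ model, the $s$-domain $\mathcal{D}_s$ contains a subset $\mathcal{D}'_s$
isomorphic to $\mathcal{A}^*$ with respect to concatenation operations and such that
$[\![T_{\mathcal{A}}(w)]\!]_M \in \mathcal{D}'_s$ for any $w \in \mathcal{A}^*$.

Proof. Let $M$ be a $\mathrm{Ty}_n^{\mathcal{A}}$ model and $c_i =_{def} [\![\mathbf{C}^i_s]\!]_M$ for $i = 0, \ldots N$. Define the
mapping $V_{\mathcal{A}} : \mathcal{A}^* \to \mathcal{D}_s$ as follows:

$$V_{\mathcal{A}}(\epsilon) = c_0, \quad V_{\mathcal{A}}(a_i) = c_i, \quad V_{\mathcal{A}}(a_i w) = c_i\, [\![+]\!]_M\, V_{\mathcal{A}}(w).$$

Since axioms 4.1 - 4.3 are valid in $M$, $V_{\mathcal{A}}$ is a one-to-one mapping and

$$V_{\mathcal{A}}(uv) = V_{\mathcal{A}}(u)\, [\![+]\!]_M\, V_{\mathcal{A}}(v).$$

Thus the full image $\mathcal{D}'_s$ of $\mathcal{A}^*$ under this mapping is isomorphic to $\mathcal{A}^*$ and,
by definition (4.4), $[\![T_{\mathcal{A}}(w)]\!]_M \in \mathcal{D}'_s$ for any $w \in \mathcal{A}^*$. □

This lemma allows us to identify $[\![T_{\mathcal{A}}(w)]\!]_M$ with $w$, $[\![+]\!]_M$ with
word concatenation and the subset $\mathcal{D}'_s$ with $\mathcal{A}^*$ in any $\mathrm{Ty}_n^{\mathcal{A}}$ model $M$.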

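Lemma 5.3. If $\mathcal{M} \subseteq \mathcal{T}_\alpha$ is logically closed, $A^1_\alpha \in \mathcal{M}$, ... $A^k_\alpha \in \mathcal{M}$ and
$\vdash A_\alpha = A^1_\alpha \vee \ldots \vee A_\alpha = A^k_\alpha$, then $A_\alpha \in \mathcal{M}$ also.

Proof. The proof is by induction on $k$. For $k \le 2$, the statement follows
directly from Definition 3.5. For $k > 2$, by the metatheorems 5.1, 5.2 and

$$\vdash (((A_\alpha | B_\alpha)\, C_t) = A_\alpha) \vee (((A_\alpha | B_\alpha)\, C_t) = B_\alpha),$$

we have

$$\vdash A_\alpha = B_\alpha \vee A_\alpha = A^k_\alpha,$$

where

$$B_\alpha =_{def} (\ldots((A^1_\alpha | A^2_\alpha)(A_\alpha = A^2_\alpha) \,|\, A^3_\alpha)(A_\alpha = A^3_\alpha) \ldots |\, A^{k-1}_\alpha)(A_\alpha = A^{k-1}_\alpha)$$

and therefore

$$\vdash B_\alpha = A^1_\alpha \vee \ldots \vee B_\alpha = A^{k-1}_\alpha.$$

Thus, if this implies that $B_\alpha \in \mathcal{M}$ (the induction hypothesis), then $A_\alpha \in \mathcal{M}$ also, according to Definition
3.5. □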
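The induction step of Lemma 5.3 rests on a nested choice term that always equals one of the first $k-1$ candidates and equals $A_\alpha$ whenever $A_\alpha$ equals one of them. The sketch below checks this semantically by brute force, under two assumptions made for illustration only: the choice shortcut $(A|B)$ is read as a function of a truth value returning $B$ on true and $A$ on false, and small integers stand in for denotations in a single model. The helper names `choose`, `build_b` and `check` are hypothetical.

```python
from itertools import product

def choose(a, b):
    """(a|b): a function of a truth value returning b on True, a on False."""
    return lambda c: b if c else a

def build_b(a, candidates):
    """Nested choice over the first k-1 candidates, as in the induction step."""
    b = candidates[0]
    for cand in candidates[1:]:
        b = choose(b, cand)(a == cand)
    return b

def check(domain=range(4), k=3):
    """Brute-force check of both properties of the nested choice."""
    for a, *cands in product(domain, repeat=k + 1):
        if a not in cands:
            continue                       # premise: a equals some candidate
        b = build_b(a, cands[:-1])
        assert b in cands[:-1]             # b is one of the first k-1 candidates
        assert a == b or a == cands[-1]    # a equals b or the last candidate
    return True

print(check())
```

This is a check in one finite model, not a derivation in the type theory; it only illustrates why the disjunction over $k$ candidates can be reduced to a disjunction over two.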
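
Proposition 5.4. Let an $\alpha$-language $\mathcal{L}$ be logically closed, let the constant
$\mathbf{C}^0_{s\alpha t}$ belong to $\mathrm{Ty}_n^{\mathcal{A}+}$, but not to $\mathrm{Ty}_n^{\mathcal{A}}$, and let $\Delta_{\mathcal{L}}$ be the minimal set of $\mathrm{Ty}_n^{\mathcal{A}+}$
formulas such that, whenever $(w, A_\alpha) \in \mathcal{L}$, the formula $\mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha$ belongs
to $\Delta_{\mathcal{L}}$. Then $(\Delta_{\mathcal{L}}, \mathbf{C}^0_{s\alpha t})$ is a representation of $\mathcal{L}$.

Proof. Consistency of the set $\Delta_{\mathcal{L}}$ follows from the fact that all its formulas
are valid in a model where the constant $\mathbf{C}^0_{s\alpha t}$ is interpreted by a function
taking the true value for all its arguments.

The implication $(w, A_\alpha) \in \mathcal{L} \Rightarrow \Delta_{\mathcal{L}} \vdash \mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha$ follows directly
from the definition of $\Delta_{\mathcal{L}}$.

To prove the opposite implication, assume that $\Delta_{\mathcal{L}} \vdash \mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha$ and
$\Delta'_{\mathcal{L}}$ is the (necessarily finite) subset of axioms from $\Delta_{\mathcal{L}}$ participating in a
proof of this derivation, so that

$$\Delta'_{\mathcal{L}} \vdash \mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha \qquad (5.3)$$

as well. Let $A^1_\alpha, \ldots A^k_\alpha$ be all those terms that occur in the formulas
$\mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A^i_\alpha$ belonging to $\Delta'_{\mathcal{L}}$. If $(w, A_\alpha) \notin \mathcal{L}$, then there exist a $\mathrm{Ty}_n^{\mathcal{A}}$
model $M$ and an assignment of variables $a$ such that $[\![A_\alpha]\!]_{M,a} \neq [\![A^i_\alpha]\!]_{M,a}$ for
all $i = 1, \ldots k$. Indeed, due to the Completeness Theorem, we would otherwise
have

$$\vdash A_\alpha = A^1_\alpha \vee \ldots \vee A_\alpha = A^k_\alpha$$

and therefore, by Lemma 5.3 and the assumption of logical closure of $\mathcal{L}$,
$(w, A_\alpha) \in \mathcal{L}$. Consider an extension $M^+$ of the model $M$ for $\mathrm{Ty}_n^{\mathcal{A}+}$ where
$[\![\mathbf{C}^0_{s\alpha t}]\!]_{M^+}$ is the function taking the true value for all its arguments except for
$(w, [\![A_\alpha]\!]_{M,a})$ (where $w$ stands for $[\![T_{\mathcal{A}}(w)]\!]_M$, according to Lemma 5.2). All
formulas of $\Delta'_{\mathcal{L}}$ are satisfied by the assignment $a$ in $M^+$, but $\mathbf{C}^0_{s\alpha t}\, T_{\mathcal{A}}(w)\, A_\alpha$
is not. By the Soundness Theorem, this contradicts 5.3 and thus proves that
$(w, A_\alpha) \in \mathcal{L}$. □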

Propositions 5.1 and 5.4 immediately imply 

Theorem 5.5. An $\alpha$-language is $\mathrm{Ty}_n$-representable iff it is logically closed.

Note that this general result is not constructive in the sense that it does
not entail an effective procedure for actually building a $\mathrm{Ty}_n$ representation
for an infinite language that might be defined by a finite set of rules or rule
schemata. At the same time, it does not imply the uniqueness of the $\mathrm{Ty}_n$
representation either, and thus does not preclude a representation of such a language
that may be compact or described by a finite set of axiom schemata. We will
find such representations for some important special classes of logically closed
languages in the following sections.

6 Lexicons and canonic representation 

Definition 6.1. An $\alpha$-lexicon is a logically closed $\alpha$-language which is the
logical closure of a finite $\alpha$-language.

Definition 6.2. An $\alpha$-language is said to have a canonic representation if it
is represented by a term $L_{s\alpha t}$ such that

$$\vdash L_{s\alpha t} = (= A^1_s)\,\&\,(= A^1_\alpha) \vee \ldots \vee (= A^k_s)\,\&\,(= A^k_\alpha)$$

where all $A^i_s$ are $s$-constants and $A^i_\alpha$ are $\mathrm{Ty}_n$ terms.

In this section we show that any $\alpha$-lexicon has a canonic representation.

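Lemma 6.1. $\vdash T_{\mathcal{A}}(u) = T_{\mathcal{A}}(v)$ iff $u = v$ and $\vdash T_{\mathcal{A}}(u) \neq T_{\mathcal{A}}(v)$ iff $u \neq v$.

Proof. According to 4.1 - 4.3, if $m = l$ and $i_1 = j_1$, ... $i_m = j_m$, then

$$\vdash \mathbf{C}^{i_1}_s + \ldots + \mathbf{C}^{i_m}_s = \mathbf{C}^{j_1}_s + \ldots + \mathbf{C}^{j_l}_s$$

and if at least one of these conditions does not hold, then

$$\vdash \mathbf{C}^{i_1}_s + \ldots + \mathbf{C}^{i_m}_s \neq \mathbf{C}^{j_1}_s + \ldots + \mathbf{C}^{j_l}_s.$$

This proves both statements. □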

Lemma 6.2. Let the mapping $T_{\mathcal{A}\alpha}$ from the set of $\alpha$-languages to the power
set $\mathcal{P}(\mathcal{T}_s \otimes \mathcal{T}_\alpha)$ be defined by

$$T_{\mathcal{A}\alpha}(\mathcal{L}) = \{(T_{\mathcal{A}}(w), A_\alpha) \mid (w, A_\alpha) \in \mathcal{L}\}.$$

Then $\overline{\mathcal{L}}$ is the logical closure of $\mathcal{L}$ iff $T_{\mathcal{A}\alpha}(\overline{\mathcal{L}})$ is the logical closure of $T_{\mathcal{A}\alpha}(\mathcal{L})$.

Proof. As usual for any closure constructed by a ternary relation, a term
belongs to a logical closure iff there exists a finite proof tree where this term
is the root, every non-leaf node $A_\alpha$ has a pair of children $B_\alpha$, $C_\alpha$ that satisfy
the relation (3.1) and all leaves belong to the original set. Similarly, a pair
from $\mathcal{A}^* \otimes \mathcal{T}_\alpha$ belongs to the logical closure of an $\alpha$-language iff there exists a
finite proof tree where this pair is the root, every non-leaf node $(w, A_\alpha)$ has
a pair of children $(w, B_\alpha)$, $(w, C_\alpha)$ where $B_\alpha$ and $C_\alpha$ satisfy the relation (3.1)
and all leaves belong to the original language. Replacing every node $(w, A_\alpha)$ of
such a tree with $(T_{\mathcal{A}}(w), A_\alpha)$ by Lemma 6.1 converts it to a proof tree for the
logical closure of $T_{\mathcal{A}\alpha}(\mathcal{L})$. Thus if $\overline{\mathcal{L}}$ is the logical closure of $\mathcal{L}$, then $T_{\mathcal{A}\alpha}(\overline{\mathcal{L}})$
is the logical closure of $T_{\mathcal{A}\alpha}(\mathcal{L})$. To prove the opposite implication, note that
due to the tautology

$$\vdash A_t \wedge C_t \vee B_t \wedge D_t \rightarrow C_t \vee D_t,$$

the relation

$$\vdash A_s = B_s \wedge A_\alpha = B_\alpha \vee A_s = C_s \wedge A_\alpha = C_\alpha$$

implies (3.1). Therefore replacing every node $(A_s, A_\alpha)$ of a proof tree for the
logical closure of $T_{\mathcal{A}\alpha}(\mathcal{L})$ with $(w, A_\alpha)$, where $w$ is the expression occurring
in the root of the original tree, converts it to a proof tree for $\overline{\mathcal{L}}$, by Lemma
6.1. □

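Definition 6.3. A set $\mathcal M \subset \mathcal T_\alpha$ is said to be represented by a term $M_{\alpha t}$ if $\mathcal M = \{A_\alpha \mid\ \vdash M_{\alpha t} A_\alpha\}$. 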

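Lemma 6.3. The logical closure $\overline{\mathcal M}$ of a finite set $\mathcal M = \{A_\alpha^1, \ldots A_\alpha^l\}$ is 
represented by the term $M_{\alpha t} =_{def} (= A_\alpha^1) \vee \ldots \vee (= A_\alpha^l)$. 

Proof. $\vdash M_{\alpha t} B_\alpha$ and $\vdash M_{\alpha t} C_\alpha$ together with (3.1) imply $\vdash M_{\alpha t} A_\alpha$ by the 
metatheorem 

$$\vdash (A_\alpha = B_\alpha \wedge M_{\alpha t} B_\alpha) \to M_{\alpha t} A_\alpha$$

(also applied to $C_\alpha$) and the tautology 

$$\vdash (A_t \vee B_t) \wedge D_t \wedge E_t \to (A_t \wedge D_t \vee B_t \wedge E_t).$$

Thus, the set defined by $\vdash M_{\alpha t} A_\alpha$ is logically closed and therefore, according 
to Definition 3.5, includes $\overline{\mathcal M}$. The opposite inclusion follows directly from 
Lemma 5.3. □ 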

Theorem 6.4. A logically closed language has a canonic representation iff 
it is a lexicon. 

Proof. Let $\mathcal L = \{(w_1, A_\alpha^1), \ldots (w_l, A_\alpha^l)\}$ and 

$$L_{s\alpha t} =_{def} (= A_s^1)\,\&\,(= A_\alpha^1) \vee \ldots \vee (= A_s^l)\,\&\,(= A_\alpha^l),$$

where $A_s^i =_{def} T_{\mathcal A}(w_i)$. According to Lemma 6.3, the term $L_{s\alpha t}$ represents 
the logical closure of $T_{\mathcal A\alpha}(\mathcal L)$ and therefore, according to Lemma 6.2, also the 
logical closure $\overline{\mathcal L}$. □ 
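
As an informal illustration (not part of the formalism), the canonic representation of a finite lexicon can be mimicked in Python as a finite disjunction of pair tests. The sketch assumes that words are plain strings, that meanings are opaque labels, and that string equality stands in for provable equality of terms, so logical closure is not modelled. The names `canonic_representation` and `toy_lexicon` are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Pair = Tuple[str, str]  # (word, meaning label); labels stand in for terms


def canonic_representation(lexicon: Iterable[Pair]) -> Callable[[str, str], bool]:
    """Build L(w, a) as the disjunction over i of (w == w_i and a == a_i)."""
    pairs = tuple(lexicon)

    def L(word: str, meaning: str) -> bool:
        # Each disjunct mirrors one singleton built from a word and a meaning.
        return any(word == w_i and meaning == a_i for w_i, a_i in pairs)

    return L


# Hypothetical toy lexicon, chosen only for illustration.
toy_lexicon = [("house", "House"), ("car", "Car")]
L = canonic_representation(toy_lexicon)
```

The predicate `L` holds exactly on the listed pairs, which is the finite-disjunction shape required of a canonic representation; equivalence of meanings up to provability is deliberately left out.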



7 Recursive languages and representations 

Given a finite set of lexicons, one can obtain an arbitrarily complex logically 
closed language by recursively applying the operations of logical join, logical 
concatenation and application of semantic rules to the initial lexicons. In 
this section we will see how a compact $Ty_n$ representation of such a recursive 
language can be built. 

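Definition 7.1. Let $\mathcal L_1, \ldots \mathcal L_m$ be logically closed languages and let one of 
the following be true for every $1 \leq i \leq m$: 

(i) $\mathcal L_i$ is a lexicon and $L^i_{s\alpha_i t}$ is its canonic representation 

(ii) there exist such $1 \leq j(i) \leq m$ and $1 \leq k(i) \leq m$ that $\mathcal L_i = \mathcal L_{j(i)} \sqcup \mathcal L_{k(i)}$ 

(iii) there exist such $1 \leq j(i) \leq m$ and $1 \leq k(i) \leq m$ that $\mathcal L_i = \mathcal L_{j(i)} * \mathcal L_{k(i)}$ 

(iv) there exists such $1 \leq j(i) \leq m$ that $\mathcal L_i = \mathcal L_{j(i)} \triangleright \mathcal R_i$, where a logically 
closed semantic rule $\mathcal R_i$ has a canonic representation $R^i_{\alpha_{j(i)}\alpha_i t}$ in the 
range of $\mathcal L_{j(i)}$. 

The full set $\mathcal{RG}$ of these conditions comprises a recursive grammar for the 
languages $\mathcal L_1, \ldots \mathcal L_m$. 

For a given recursive grammar, let $\vec{\mathcal L}$ denote the language tuple $(\mathcal L_1, \ldots \mathcal L_m)$, 
let $\vec{\mathcal L}^0$ denote a tuple of lexicons such that $\mathcal L^0_i =_{def} \mathcal L_i$ whenever $\mathcal L_i$ is a lexicon 
and $\mathcal L^0_i = \emptyset$ otherwise and let $\vec{\mathcal L}'$ denote a language tuple such that $\mathcal L'_i = \emptyset$ 
whenever $\mathcal L_i$ is a lexicon and $\mathcal L'_i$ is the right-hand part of the equality in 
condition (ii), (iii) or (iv) otherwise. Finally, let $\mathcal E$ denote the operator which 
produces $\vec{\mathcal L}'$ for a given $\vec{\mathcal L}$, so that the recursive grammar can be written as 

$$\vec{\mathcal L} = \vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L}, \qquad (7.1)$$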

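where $\sqcup$ stands for a component-wise logical join of language tuples of the 
same arity and types. 

Proposition 7.1. Equation 7.1 is satisfied by 

$$\vec{\mathcal L} = \bigcup_{i=0}^{\infty} \vec{\mathcal L}^i, \qquad (7.2)$$

where 

$$\vec{\mathcal L}^i = \vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L}^{i-1}, \quad \text{for all } i \geq 1. \qquad (7.3)$$

Proof. As the operator $\mathcal E$ is a combination of logical joins, logical concatenations 
and semantic rule applications, it inherits the monotonicity property 

$$\vec{\mathcal L} \subseteq \vec{\mathcal K} \implies \mathcal E \vec{\mathcal L} \subseteq \mathcal E \vec{\mathcal K}, \qquad (7.4)$$

where $\vec{\mathcal K}$ is a language tuple of the same arity and types as $\vec{\mathcal L}$ and $\subseteq$ stands 
for component-wise inclusion. From this and 7.3, by induction on $i$ it follows 
that 

$$\vec{\mathcal L}^i \supseteq \vec{\mathcal L}^{i-1}, \quad \text{for all } i \geq 1.$$

Therefore, for any tuple $e =_{def} ((w_1, A_{\alpha_1}), \ldots (w_m, A_{\alpha_m}))$ there exists an index 
$i(e)$ such that 

$$e \in \bigcup_{i=0}^{\infty} \vec{\mathcal L}^i \implies e \in \vec{\mathcal L}^{i(e)},$$

where $\in$ stands for component-wise membership. From here, for the tuple $\vec{\mathcal L}$ 
given by 7.2 we have that 

$$e \in \vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L} \implies e \in \vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L}^{i(e)} = \vec{\mathcal L}^{i(e)+1},$$

that is, $\vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L} \subseteq \vec{\mathcal L}$. On the other hand, from the same property 7.4 we 
have that $\vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L} \supseteq \vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L}^i$ for any $i \geq 0$ and therefore $\vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L} \supseteq \vec{\mathcal L}$. 
Thus, $\vec{\mathcal L}^0 \sqcup \mathcal E \vec{\mathcal L} = \vec{\mathcal L}$. □ 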

Note that in the general case, 7.2 might not be the only solution satisfying 
a recursive grammar. It can, however, be shown to be a subset of any other 
existing solution. This fact will not be used below and hence need not be 
proven here, but we will refer to the solution 7.2 as the minimal one. 
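
The iteration 7.3 can be sketched in Python for finite approximations. The sketch below is only an illustration under stated assumptions: languages are finite sets of (word, meaning) pairs, logical closure is ignored, and words longer than a hypothetical bound `max_len` are dropped so that the loop terminates. The toy grammar and all names are invented for the example.

```python
from typing import Callable, List, Set, Tuple

Pair = Tuple[str, object]          # (word, meaning)
Language = Set[Pair]


def concat(left: Language, right: Language) -> Language:
    """Concatenate words and pair up meanings (no logical closure here)."""
    return {(u + v, (a, b)) for u, a in left for v, b in right}


def least_solution(lexicons: List[Language],
                   step: Callable[[List[Language]], List[Language]],
                   max_len: int) -> List[Language]:
    """Iterate L^i = L^0 joined with E(L^(i-1)) until a fixed point is reached.

    Words longer than max_len are dropped so that the loop terminates;
    this cut-off is an assumption of the sketch, not of the text.
    """
    current = [set(lex) for lex in lexicons]
    while True:
        produced = step(current)
        nxt = [{p for p in (lex | new) if len(p[0]) <= max_len}
               for lex, new in zip(lexicons, produced)]
        if nxt == current:
            return current
        current = nxt


# Hypothetical grammar: L1 is a lexicon, L2 = L1 join L3, L3 = L1 * L2.
lexicons = [{("a", "A")}, set(), set()]


def step(langs: List[Language]) -> List[Language]:
    l1, l2, l3 = langs
    return [set(), l1 | l3, concat(l1, l2)]


solution = least_solution(lexicons, step, max_len=3)
words_l2 = {w for w, _ in solution[1]}
```

Because `step` is monotone with respect to set inclusion, each pass can only add pairs, which is the property the proof relies on; the length bound replaces the infinite union by a finite one.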

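Definition 7.2. For a given recursive grammar $\mathcal{RG}$, let $\Delta_{\mathcal{RG}}$ denote the set 
of $m$ formulas $D^1_t, \ldots D^m_t$, where 

(i) $D^i_t =_{def} (C^i_{s\alpha_i t} = L^i_{s\alpha_i t})$, if $\mathcal L_i$ is a lexicon represented by $L^i_{s\alpha_i t}$, 

(ii) $D^i_t =_{def} (C^i_{s\alpha_i t} = C^{j(i)}_{s\alpha_i t} \vee C^{k(i)}_{s\alpha_i t})$, if $\mathcal L_i = \mathcal L_{j(i)} \sqcup \mathcal L_{k(i)}$, 

(iii) $D^i_t =_{def} (C^i_{s\alpha_i t} = C^{j(i)}_{s\alpha_{j(i)} t} * C^{k(i)}_{s\alpha_{k(i)} t})$, if $\mathcal L_i = \mathcal L_{j(i)} * \mathcal L_{k(i)}$, so that $\alpha_i =_{def} \alpha_{j(i)} \times \alpha_{k(i)}$, 

(iv) $D^i_t =_{def} (C^i_{s\alpha_i t} = C^{j(i)}_{s\alpha_{j(i)} t} \triangleright R^i_{\alpha_{j(i)}\alpha_i t})$, if $\mathcal L_i = \mathcal L_{j(i)} \triangleright \mathcal R_i$ and $\mathcal R_i$ is represented by $R^i_{\alpha_{j(i)}\alpha_i t}$, 

all the constants $C^1_{s\alpha_1 t}, \ldots C^m_{s\alpha_m t}$ are assumed to exist only in an extension 
$Ty_n^{\mathcal A+}$, but not in $Ty_n^{\mathcal A}$, and the $Ty_n^{\mathcal A}$ operators of logical concatenation and 
semantic rule application are defined as follows: 

$$(*_{(s\alpha t)(s\beta t)s\alpha\beta t}) =_{def} \lambda x_{s\alpha t} \lambda y_{s\beta t} \lambda x_s \lambda x_\alpha \lambda x_\beta \exists y_s \exists z_s (x_s = y_s + z_s \wedge x_{s\alpha t}\, y_s\, x_\alpha \wedge y_{s\beta t}\, z_s\, x_\beta),$$

$$(\triangleright_{(s\alpha t)(\alpha\beta t)s\beta t}) =_{def} \lambda x_{s\alpha t} \lambda y_{\alpha\beta t} \lambda x_s \lambda x_\beta \exists x_\alpha (x_{s\alpha t}\, x_s\, x_\alpha \wedge y_{\alpha\beta t}\, x_\alpha\, x_\beta).$$

Below we will also make use of the $Ty_n^{\mathcal A}$ operator of a functional semantic 
rule application: 

$$(\Rightarrow_{(s\alpha t)(\alpha\beta)s\beta t}) =_{def} \lambda x_{s\alpha t} \lambda y_{\alpha\beta} (x_{s\alpha t} \triangleright (\lambda x_\alpha (= y_{\alpha\beta}\, x_\alpha))).$$

Note that every singleton $(= A_s^i)\,\&\,(= A_\alpha^i)$ can be equivalently written down 
as $(= A_s^i) \Rightarrow A_\alpha^i$. 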
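
The three operators above can be mirrored on Python predicates. This is a minimal sketch, assuming that languages are Boolean functions on strings and meanings, that the existential over meanings ranges over an explicitly supplied finite domain (an assumption needed only to make the sketch executable), and that Python equality stands in for provable equality. All names are hypothetical.

```python
from typing import Iterable


def concat(L, K):
    """(L * K)(w, a, b): some split w = y + z has L(y, a) and K(z, b)."""
    def LK(w: str, a, b) -> bool:
        return any(L(w[:i], a) and K(w[i:], b) for i in range(len(w) + 1))
    return LK


def apply_rule(L, R, alpha_domain: Iterable):
    """Rule application: some a in the domain has L(w, a) and R(a, b)."""
    candidates = tuple(alpha_domain)

    def LR(w: str, b) -> bool:
        return any(L(w, a) and R(a, b) for a in candidates)
    return LR


def apply_function(L, f, alpha_domain: Iterable):
    """Functional rule application: the rule relates a to f(a) only."""
    return apply_rule(L, lambda a, b: b == f(a), alpha_domain)


# Hypothetical toy languages given as Python predicates.
det = lambda w, a: (w, a) == ("a ", "A")
noun = lambda w, a: (w, a) in {("house", "House"), ("car", "Car")}
np = concat(det, noun)
upper = apply_function(noun, str.upper, ["House", "Car"])
```

Here `np("a house", "A", "House")` holds because the split "a " + "house" satisfies both constituents, and `upper` relates each noun to the image of its meaning under the supplied function.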

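We now prove that every language $\mathcal L_i$ of the minimal solution of a recursive 
grammar $\mathcal{RG}$ is represented by $(\Delta_{\mathcal{RG}}, C^i_{s\alpha_i t})$. To do so, we introduce the 
following auxiliary 

Definition 7.3. An $\alpha$-language $\mathcal L$ is said to have a pseudo-canonic representation 
if it is represented by $(\Delta, L_{s\alpha t})$ such that for any word $w \in \mathcal A^*$ there 
exists a finite (possibly empty) set $\{A^1_\alpha, \ldots A^l_\alpha\}$ of $Ty_n^{\mathcal A}$ terms such that 

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal A}(w) = \bigvee_{i=1}^{l} (= A^i_\alpha)$$

and adding $\Delta$ to $Ty_n^{\mathcal A+}$ as new axioms forms a conservative extension of $Ty_n^{\mathcal A}$ 
(i.e. for any $Ty_n^{\mathcal A}$ formula $A_t$, $\vdash A_t$ iff $\Delta \vdash A_t$). 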

Note that a canonic representation of an $\alpha$-lexicon is a particular case of 
a pseudo-canonic representation. 

Lemma 7.2. If $\Delta$ is a consistent set of $Ty_n^{\mathcal A+}$ formulas and for any general 
model $M$ of $Ty_n^{\mathcal A}$ and an assignment of variables in that model there exists an 
extension of $M$ for $Ty_n^{\mathcal A+}$ in which $\Delta$ is satisfied by the same assignment 
of variables, then adding $\Delta$ to $Ty_n^{\mathcal A+}$ forms a conservative extension of $Ty_n^{\mathcal A}$. 

Proof. By the Completeness Theorem, if not $\vdash A_t$, then there exists a general 
$Ty_n^{\mathcal A}$ model $M$ and assignment of variables $a$ by which $A_t$ is not satisfied in $M$. 
If $M$ can be extended to a model for $Ty_n^{\mathcal A+}$ where $\Delta$ is still satisfied by 
$a$, then by the Soundness Theorem the derivation $\Delta \vdash A_t$ is impossible. □ 

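Lemma 7.3. The set $\{A^1_\alpha, \ldots A^l_\alpha\}$ appearing in the first condition of Definition 
7.3 generates all meanings of $w$ in the language $\mathcal L$. 

Proof. If $A_\alpha$ is a $Ty_n^{\mathcal A}$ term and $(w, A_\alpha) \in \mathcal L$, then $\Delta \vdash L_{s\alpha t}\, T_{\mathcal A}(w)\, A_\alpha$ and 
therefore 

$$\Delta \vdash (A_\alpha = A^1_\alpha) \vee \ldots \vee (A_\alpha = A^l_\alpha),$$

which implies that 

$$\vdash (A_\alpha = A^1_\alpha) \vee \ldots \vee (A_\alpha = A^l_\alpha).$$

Thus, by Lemma 6.3, any meaning of $w$ belongs to the logical closure of 
$\{A^1_\alpha, \ldots A^l_\alpha\}$. □ 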

Lemma 7.4. In any recursive grammar $\mathcal{RG}$, a lexicon $\mathcal L_i$ with a canonic 
representation $L^i_{s\alpha_i t}$ also has the pseudo-canonic representation $(\Delta_{\mathcal{RG}}, L^i_{s\alpha_i t})$. 

Proof. The first condition of Definition 7.3 follows directly from the assumption 
that $L^i_{s\alpha_i t}$ is a canonic representation. 

Now we prove that the set $\Delta_{\mathcal{RG}}$ is consistent and also satisfied by a given 
assignment of variables $a$ in some extension $M^+$ of an arbitrary $Ty_n^{\mathcal A}$ model 
$M$. Lemma 5.2 and the fact that a recursive grammar is satisfied by some 
logically closed languages $\mathcal L_1, \ldots \mathcal L_m$ allow the extension $M^+$ to be defined as 
follows: let the function $\|C^i_{s\alpha_i t}\|_{M^+}$ for given arguments $w \in \mathcal D_s$ and $v \in \mathcal D_{\alpha_i}$ 
take the true value iff $w \in \mathcal A^*$ and the language $\mathcal L_i$ contains a pair $(w, A_{\alpha_i})$ 
such that $\|A_{\alpha_i}\|_{M^+,a} = v$. The conditions (i)-(iv) of Definition 7.2 then imply 
that every $D^i_t$ is satisfied by $a$ in such an extended model. By Lemma 7.2, 
this proves the second condition of Definition 7.3. □ 

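Lemma 7.5. If $\overline{\mathcal M}$ and $\overline{\mathcal N}$ are logical closures of $\mathcal M \subset \mathcal T_\alpha$ and $\mathcal N \subset \mathcal T_\alpha$, 
respectively, then $\overline{\mathcal M} \sqcup \overline{\mathcal N} = \overline{\mathcal M \cup \mathcal N}$. 

Proof. Every element of $\overline{\mathcal M \cup \mathcal N}$ belongs to $\overline{\mathcal M} \sqcup \overline{\mathcal N}$ because $\mathcal M \subseteq \overline{\mathcal M}$, 
$\mathcal N \subseteq \overline{\mathcal N}$, so a proof tree for any term from $\overline{\mathcal M \cup \mathcal N}$ is also a proof tree 
for $\overline{\mathcal M} \sqcup \overline{\mathcal N}$. On the other hand, as every leaf of a proof tree for an element 
of $\overline{\mathcal M} \sqcup \overline{\mathcal N}$ belongs to either $\overline{\mathcal M}$ or $\overline{\mathcal N}$, it can be expanded to a proof tree for 
the corresponding set and thus the entire tree can be converted to a proof tree 
for $\overline{\mathcal M \cup \mathcal N}$ with the same root, which proves its membership in $\overline{\mathcal M \cup \mathcal N}$. □ 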

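Proposition 7.6. Let $\alpha$-languages $\mathcal L$ and $\mathcal K$ have pseudo-canonic representations 
$(\Delta, L_{s\alpha t})$ and $(\Delta, K_{s\alpha t})$ respectively. Then their logical join $\mathcal L \sqcup \mathcal K$ is 
represented by $(\Delta, L_{s\alpha t} \vee K_{s\alpha t})$, which is also a pseudo-canonic representation. 

Proof. Let 

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal A}(w) = \bigvee_{i=1}^{l} (= A^i_\alpha) \quad \text{and} \quad \Delta \vdash K_{s\alpha t}\, T_{\mathcal A}(w) = \bigvee_{i=1}^{k} (= B^i_\alpha).$$

Then 

$$\Delta \vdash (L_{s\alpha t} \vee K_{s\alpha t})\, T_{\mathcal A}(w) = \bigvee_{i=1}^{l} (= A^i_\alpha) \vee \bigvee_{i=1}^{k} (= B^i_\alpha).$$

This proves that $(\Delta, L_{s\alpha t} \vee K_{s\alpha t})$ is a pseudo-canonic representation and 
therefore for any $Ty_n^{\mathcal A}$ term $A_\alpha$, 

$$\Delta \vdash (L_{s\alpha t} \vee K_{s\alpha t})\, T_{\mathcal A}(w)\, A_\alpha \iff\ \vdash \bigvee_{i=1}^{l} (A_\alpha = A^i_\alpha) \vee \bigvee_{i=1}^{k} (A_\alpha = B^i_\alpha).$$

By Lemmas 7.3, 6.3 and 7.5 this proves that $(\Delta, L_{s\alpha t} \vee K_{s\alpha t})$ represents 
$\mathcal L \sqcup \mathcal K$. □ 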

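Proposition 7.7. Let an $\alpha$-language $\mathcal L$ have a pseudo-canonic representation 
$(\Delta, L_{s\alpha t})$ and a $\beta$-language $\mathcal K$ have a pseudo-canonic representation 
$(\Delta, K_{s\beta t})$. Then the logical concatenation $\mathcal L * \mathcal K$ is represented by 
$(\Delta, L_{s\alpha t} * K_{s\beta t})$, which is also a pseudo-canonic representation. 

Proof. Let $w = h_0 + t_0 = \ldots = h_l + t_l$ be all possible splits of a word $w$ into 
head and tail and 

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal A}(h_m) = \bigvee_{i=1}^{l_m} (= A^{m,i}_\alpha), \qquad \Delta \vdash K_{s\beta t}\, T_{\mathcal A}(t_m) = \bigvee_{j=1}^{k_m} (= B^{m,j}_\beta)$$

(for $m = 0, \ldots l$). Then by axioms 4.1 - 4.3 and metatheorems 

$$\ldots \qquad (7.5)$$

$$\vdash \exists x_\gamma (x_\gamma = A_\gamma \wedge B_t(x_\gamma)) = B_t(A_\gamma) \qquad (7.6)$$

(where $x_\gamma$ is not free in $A_\gamma$) we have: 

$$\Delta \vdash (L_{s\alpha t} * K_{s\beta t})\, T_{\mathcal A}(w) = \bigvee_{m=0}^{l} \bigvee_{i=1}^{l_m} \bigvee_{j=1}^{k_m} (= A^{m,i}_\alpha)\,\&\,(= B^{m,j}_\beta).$$

This proves that $(\Delta, L_{s\alpha t} * K_{s\beta t})$ is a pseudo-canonic representation and 
therefore for any $Ty_n^{\mathcal A}$ terms $A_\alpha$, $B_\beta$, 

$$\Delta \vdash (L_{s\alpha t} * K_{s\beta t})\, T_{\mathcal A}(w)\, A_\alpha\, B_\beta \iff\ \vdash \bigvee_{m=0}^{l} \bigvee_{i=1}^{l_m} \bigvee_{j=1}^{k_m} (A_\alpha = A^{m,i}_\alpha) \wedge (B_\beta = B^{m,j}_\beta).$$

By Definition 3.7 and Lemma 6.3 this proves that $(\Delta, L_{s\alpha t} * K_{s\beta t})$ represents 
$\mathcal L * \mathcal K$. □ 

Proposition 7.8. Let an $\alpha$-language $\mathcal L$ have a pseudo-canonic representation 
$(\Delta, L_{s\alpha t})$ and a rule $\mathcal R \subset \mathcal T_\alpha \otimes \mathcal T_\beta$ have a canonic representation $R_{\alpha\beta t}$ in the 
range of $\mathcal L$. Then the language $\mathcal L \triangleright \mathcal R$ is represented by $(\Delta, L_{s\alpha t} \triangleright R_{\alpha\beta t})$, which 
is also a pseudo-canonic representation. 

Proof. Let 

$$\Delta \vdash L_{s\alpha t}\, T_{\mathcal A}(w) = \bigvee_{i=1}^{l} (= A^i_\alpha), \qquad \vdash R_{\alpha\beta t}\, A^i_\alpha = \bigvee_{j=1}^{k_i} (= B^{i,j}_\beta)$$

(for $i = 1, \ldots l$). Then by metatheorem 7.6 we have 

$$\Delta \vdash (L_{s\alpha t} \triangleright R_{\alpha\beta t})\, T_{\mathcal A}(w) = \bigvee_{i=1}^{l} \bigvee_{j=1}^{k_i} (= B^{i,j}_\beta).$$

This proves that $(\Delta, L_{s\alpha t} \triangleright R_{\alpha\beta t})$ is a pseudo-canonic representation and 
therefore for any $Ty_n^{\mathcal A}$ term $B_\beta$, 

$$\Delta \vdash (L_{s\alpha t} \triangleright R_{\alpha\beta t})\, T_{\mathcal A}(w)\, B_\beta \iff\ \vdash \bigvee_{i=1}^{l} \bigvee_{j=1}^{k_i} (B_\beta = B^{i,j}_\beta).$$

By Definition 3.3 and Lemma 6.3 this proves that $(\Delta, L_{s\alpha t} \triangleright R_{\alpha\beta t})$ represents 
$\mathcal L \triangleright \mathcal R$. □ 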

Propositions 7.6 - 7.8 establish that, given some logically closed languages 
known to have pseudo-canonic representations that share a single common 
axiom set $\Delta$ and semantic rules that have canonic representations, one can 
build $Ty_n$ representations of their logical joins, logical concatenations and 
applications of the rules by means of the three corresponding $Ty_n^{\mathcal A}$ operators: 
$(\vee_{(s\alpha t)(s\alpha t)s\alpha t})$, $(*_{(s\alpha t)(s\beta t)s\alpha\beta t})$ and $(\triangleright_{(s\alpha t)(\alpha\beta t)s\beta t})$; moreover these 
representations are also pseudo-canonic and share the same axiom set $\Delta$. 

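Lemma 7.9. $Ty_n^{\mathcal A}$ operators of logical join, logical concatenation and semantic 
rule application reproduce the associativity and monotonicity properties: 

$$\vdash (x_{s\alpha t} \vee y_{s\alpha t}) * z_{s\beta t} = x_{s\alpha t} * z_{s\beta t} \vee y_{s\alpha t} * z_{s\beta t},$$

$$\vdash z_{s\beta t} * (x_{s\alpha t} \vee y_{s\alpha t}) = z_{s\beta t} * x_{s\alpha t} \vee z_{s\beta t} * y_{s\alpha t},$$

$$\vdash (x_{s\alpha t} \vee y_{s\alpha t}) \triangleright z_{\alpha\beta t} = x_{s\alpha t} \triangleright z_{\alpha\beta t} \vee y_{s\alpha t} \triangleright z_{\alpha\beta t},$$

$$\vdash (x_{s\alpha t} \to y_{s\alpha t}) \to (x_{s\alpha t} * z_{s\beta t} \to y_{s\alpha t} * z_{s\beta t}),$$

$$\vdash (x_{s\alpha t} \to y_{s\alpha t}) \to (z_{s\beta t} * x_{s\alpha t} \to z_{s\beta t} * y_{s\alpha t}),$$

$$\vdash (x_{s\alpha t} \to y_{s\alpha t}) \to (x_{s\alpha t} \triangleright z_{\alpha\beta t} \to y_{s\alpha t} \triangleright z_{\alpha\beta t}).$$

Proof. All these properties follow from the definition of operator $(*_{(s\alpha t)(s\beta t)s\alpha\beta t})$ 
due to the tautologies 

$$\vdash (A_t \vee B_t) \wedge C_t = A_t \wedge C_t \vee B_t \wedge C_t, \qquad \vdash C_t \wedge (A_t \vee B_t) = C_t \wedge A_t \vee C_t \wedge B_t,$$

$$\vdash (A_t \to B_t) \to (A_t \wedge C_t \to B_t \wedge C_t), \qquad \vdash (A_t \to B_t) \to (C_t \wedge A_t \to C_t \wedge B_t)$$

and the metatheorems 7.5 and 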

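□ 

Theorem 7.10. Every language $\mathcal L_i$ of the minimal solution of a recursive 
grammar $\mathcal{RG}$ is represented by $(\Delta_{\mathcal{RG}}, C^i_{s\alpha_i t})$. 

Proof. If (using the notations introduced in the proof of Proposition 7.1) 
$e \in \vec{\mathcal L}$, then there exists an index $i$ such that $e \in \vec{\mathcal L}^i$. As every $\vec{\mathcal L}^i$ is built from 
lexicons $\vec{\mathcal L}^0$ and $\vec{\mathcal L}^{i-1}$ by applying logical joins, logical concatenations and 
semantic rule applications, from Propositions 7.6 - 7.8 it follows (by induction 
on $i$) that every component of $\vec{\mathcal L}^i$ has a pseudo-canonic representation 
$(\Delta_{\mathcal{RG}}, L^{i,j}_{\tau_j})$, where for any $i \geq 1$ 

$$L^{i,j}_{\tau_j} =_{def} L^{0,j}_{\tau_j} \vee E^j_{\tau_1 \ldots \tau_m \tau_j}\, L^{i-1,1}_{\tau_1} \ldots L^{i-1,m}_{\tau_m},$$

$\tau_j =_{def} s\alpha_j t$ and terms $E^j_{\tau_1 \ldots \tau_m \tau_j}$ stand for $Ty_n^{\mathcal A}$ operators corresponding to 
the operator $\mathcal E$. On the other hand, from Definition 7.2 we have 

$$\ldots$$

As by Lemma 7.9 the terms $E^j_{\tau_1 \ldots \tau_m \tau_j}$ reproduce the monotonicity property 
of the operator $\mathcal E$, it follows from this that for all $i \geq 0$ 

$$\ldots$$

and therefore, for every $j$ that 

$$\Delta_{\mathcal{RG}} \vdash C^j_{\tau_j}\, T_{\mathcal A}(w_j)\, A_{\alpha_j}.$$

To prove that, inversely, the above derivations imply $e \in \vec{\mathcal L}$, consider the 
$Ty_n^{\mathcal A+}$ model defined in the proof of Lemma 7.4. As such a model can extend 
an arbitrary $Ty_n^{\mathcal A}$ model, these derivations imply that there exist such terms 
$A'_{\alpha_j}$ that $e' =_{def} ((w_1, A'_{\alpha_1}), \ldots (w_m, A'_{\alpha_m})) \in \vec{\mathcal L}$ and $\vdash A'_{\alpha_j} = A_{\alpha_j}$. As $\mathcal L_j$ are 
logically closed, this implies that $e \in \vec{\mathcal L}$ also. □ 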

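Corollary 7.11. Every language $\mathcal L_i$ of the minimal solution of a recursive 
grammar $\mathcal{RG}$ is represented by the term $\lambda x_s \lambda x_{\alpha_i} (D^{\mathcal{RG}}_t \to C^i_{s\alpha_i t}\, x_s\, x_{\alpha_i})$, where $D^{\mathcal{RG}}_t$ is 
the conjunction of all formulas from $\Delta_{\mathcal{RG}}$. 

Proof. See the note to Definition 4.1. □ 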

The significance of these results is that they, in particular, establish a 
class of non-trivial $Ty_n$-representable languages which turns out to be decidable, 
in spite of the fact that $Ty_n$ is undecidable and hence so are 
$Ty_n$-representable languages in general. Indeed, assume all the semantic rules 
participating in a grammar $\mathcal{RG}$ to be non-degenerate. It is easy to see that under 
this assumption the domain of every language $\mathcal L_i$ is generated by a context-free 
grammar whose set of terminals is the union of the domains of all the 
lexicons participating in $\mathcal{RG}$ and whose non-terminal symbols correspond to the 
languages $\mathcal L_i$. Therefore, this domain falls into the class of context-free (and 
hence recursive) languages of the Chomsky hierarchy. Then, under the additional 
assumption of decidability of all the $\mathcal{RG}$ semantic rules (which holds, 
for example, for any functional rules), the problem of finding all the logically 
independent representatives of the meanings of an $\mathcal L_i$ expression also turns 
out to be decidable, merely by attributing the terminal (leaf) nodes of its 
syntactic parse tree with their (lexicon-determined) meanings and inheriting 
and evaluating the meanings of upper nodes according to the rules, from 
the bottom up to the root node. Furthermore, though discarding the assumption 
of non-degenerateness of all the $\mathcal{RG}$ semantic rules may generally lead 
to languages $\mathcal L_i$ whose domains are context-sensitive (in the classical sense), 
retaining only the assumption of decidability of the rules still preserves the 
conclusion about decidability of the languages $\mathcal L_i$. Under this assumption the languages $\mathcal L_i$ 
can be parsed, for example, by a two-stage process like this: 

1. parse according to the context-free grammar $\mathcal{RG}'$ obtained from $\mathcal{RG}$ 
by replacing every rule $\mathcal R_i$ by the trivial non-degenerate rule with the 
canonic representation $R'^i_{\alpha_{j(i)}\alpha_i t} =_{def} \lambda x_{\alpha_{j(i)}} \lambda x_{\alpha_i} T$; 

2. attribute the terminal (leaf) nodes of the parse tree with their (lexicon-determined) 
meanings and inherit and evaluate the meanings of upper 
nodes according to the actual rules $\mathcal R_i$, or discard the parse if at any 
node the actual rule degenerates to $\lambda x_\alpha . F$ when applied to all incoming 
meanings. 
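
The second stage of this process can be sketched as a bottom-up evaluation over a parse tree. The sketch is illustrative only: it assumes a parse tree is already available as nested tuples, that each rule is a Python function returning `None` when it does not apply, and that meanings are hashable values. The lexicon, the rule name `div` and the function `meanings` are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Optional, Set


def meanings(tree, lexicon: Dict[str, Set], rules: Dict[str, Callable]) -> Optional[Set]:
    """Return the set of meanings of a parse tree, or None if the parse is discarded."""
    if isinstance(tree, str):                       # terminal (leaf) node
        found = lexicon.get(tree, set())
        return set(found) if found else None
    rule_name, children = tree
    child_sets = []
    for child in children:
        ms = meanings(child, lexicon, rules)
        if ms is None:                              # a sub-parse was discarded
            return None
        child_sets.append(ms)
    results = set()
    for args in product(*child_sets):               # all combinations of incoming meanings
        value = rules[rule_name](*args)             # None means the rule is inapplicable
        if value is not None:
            results.add(value)
    return results or None                          # rule degenerates on all inputs: discard


# Hypothetical lexicon and rule, chosen only to show a degenerate case.
lexicon = {"three": {3}, "two": {2}, "zero": {0}}
rules = {"div": lambda a, b: a // b if b != 0 else None}

ok = meanings(("div", ("three", "two")), lexicon, rules)
discarded = meanings(("div", ("three", "zero")), lexicon, rules)
```

A parse survives only if every node yields at least one meaning, which corresponds to discarding a parse when the actual rule degenerates on all incoming meanings.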

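To illustrate the technique of a recursive grammar representation, let us 
build a very simple English-like grammar from tiny lexicons, consisting of a 
few countable nouns and transitive verbs, the irregular verb "be", a proper 
noun and two determiners: 

$$\mathrm{Noun}_{s(et)t} = (= \text{/builder/}) \Rightarrow \mathrm{Builder}_{et} \vee (= \text{/house/}) \Rightarrow \mathrm{House}_{et} \vee (= \text{/car/}) \Rightarrow \mathrm{Car}_{et},$$

$$\mathrm{Verbt}_{s(eet)t} = (= \text{/build/}) \Rightarrow \mathrm{Build}_{eet} \vee (= \text{/sell/}) \Rightarrow \mathrm{Sell}_{eet},$$

$$\mathrm{VerbBe}_{s(eet)t} = (= \text{/be/}) \Rightarrow (=_{eet}),$$

$$\mathrm{Name}_{set} = (= \text{/Jack/}) \Rightarrow \mathrm{Jack}_e,$$

$$\mathrm{Det}_{s((et)(et)t)t} = (= \text{/a/}) \Rightarrow \lambda y_{et} \lambda z_{et} \exists x_e (y\,x \wedge z\,x) \vee (= \text{/every/}) \Rightarrow (\to_{(et)(et)t}).$$

The noun phrase can be defined by the axiom 

$$\mathrm{NP}_{s((et)t)t} = \mathrm{Det} ** \mathrm{Noun} \Rightarrow I_{((et)(et)t)(et)(et)t} \vee \mathrm{Name} \Rightarrow \lambda x_e \lambda y_{et} (y\,x)$$

and the simple verb phrase (for the third person singular) by the axiom 

$$\mathrm{VPs}_{s(et)t} = (\mathrm{Verbt} * (= \text{/s/}) \vee ((= \text{/is/}) \Rightarrow \text{/be/}) \triangleright \mathrm{VerbBe}) ** \mathrm{NP} \Rightarrow \lambda z_{eet} \lambda y_{(et)t} \lambda x^1_e (y\, \lambda x^2_e (z\, x^2_e\, x^1_e)) \vee$$

$$((= \text{/does not/}) ** \mathrm{Verbt} \vee ((= \text{/is not/}) \Rightarrow \text{/be/}) \triangleright \mathrm{VerbBe}) ** \mathrm{NP} \Rightarrow \lambda z_{eet} \lambda y_{(et)t} \lambda x^1_e (y\, \lambda x^2_e (\neg (z\, x^2_e\, x^1_e))),$$

where 

$$(**) =_{def} \lambda x_{s\alpha t} \lambda y_{s\beta t} (x * (= \text{/ /}) * y)$$

represents phrase concatenation (with a separating space). Note that the constant 
VerbBe occurs here as a semantic rule applied to simple $s$-languages 
which express the morphology of the verb "be". As these languages, as well 
as the language represented by VerbBe itself, are finite lexicons, the results 
of the applications are lexicons too and hence the conditions of Theorem 7.10 
still hold for the language we are defining. The compound verb phrase can 
now be defined by axioms like 

$$\mathrm{VPm}_{s(et)t} = \mathrm{VPs} ** (= \text{/and/}) ** \mathrm{VPs} \Rightarrow (\wedge_{(et)(et)et}) \vee \mathrm{VPs} * (= \text{/,/}) ** \mathrm{VPm} \Rightarrow (\wedge_{(et)(et)et}),$$

$$\mathrm{VP}_{s(et)t} = \mathrm{VPs} \vee \mathrm{VPm}$$

and the clause by the axiom 

$$\mathrm{Cl}_{stt} = \mathrm{NP} ** \mathrm{VP} \Rightarrow I_{((et)t)(et)t}.$$

Finally, the declarative sentence can be defined by the axiom 

$$\mathrm{S}_{stt} = \mathrm{Cl} * (= \text{/./}) \vee (\mathrm{Cl} * (= \text{/,/}) ** \mathrm{S} \vee \mathrm{Cl} ** (= \text{/and/}) ** \mathrm{S}) \Rightarrow (\wedge_{ttt}).$$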

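If $\Delta_S$ is the set of the above axioms, then, by Propositions 7.6 - 7.8, we will 
have, for example: 

$$\Delta_S \vdash \mathrm{S}\ \text{/Jack builds a house./}\ \exists x_e (\mathrm{Build}\ x_e\ \mathrm{Jack} \wedge \mathrm{House}\ x_e),$$

$$\Delta_S \vdash \mathrm{S}\ \text{/Jack sells a car and builds a house./}\ (\exists x_e (\mathrm{Sell}\ x_e\ \mathrm{Jack} \wedge \mathrm{Car}\ x_e) \wedge \exists x_e (\mathrm{Build}\ x_e\ \mathrm{Jack} \wedge \mathrm{House}\ x_e)),$$

$$\Delta_S \vdash \mathrm{S}\ \text{/Jack does not build a house./}\ \neg \exists x_e (\mathrm{Build}\ x_e\ \mathrm{Jack} \wedge \mathrm{House}\ x_e),$$

$$\Delta_S \vdash \mathrm{S}\ \text{/every builder builds a house./}\ (\mathrm{Builder} \to \lambda x^1_e \exists x^2_e (\mathrm{Build}\ x^2_e\ x^1_e \wedge \mathrm{House}\ x^2_e)),$$

$$\Delta_S \vdash \mathrm{S}\ \text{/Jack is a builder. Jack builds a house./}\ (\exists x_e (\mathrm{Builder}\ x_e \wedge \mathrm{Jack} = x_e) \wedge \exists x_e (\mathrm{Build}\ x_e\ \mathrm{Jack} \wedge \mathrm{House}\ x_e)).$$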
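
The way such meanings compose can be imitated by evaluating them in a small finite model. The following is a rough sketch rather than a derivation in the calculus: it assumes a hypothetical three-element domain, encodes predicates as curried Python functions with the object argument first, and replaces existential and universal quantification by `any` and `all` over that domain. It covers only the affirmative rules.

```python
ENTITIES = ("jack", "h1", "c1")      # hypothetical finite domain

builder = lambda x: x == "jack"
house = lambda x: x == "h1"
car = lambda x: x == "c1"
BUILDS = {("h1", "jack")}            # (object, subject) pairs
build = lambda obj: lambda subj: (obj, subj) in BUILDS
sell = lambda obj: lambda subj: False

# Determiners: "a" is existential, "every" is inclusion of predicates.
det_a = lambda y: lambda z: any(y(x) and z(x) for x in ENTITIES)
det_every = lambda y: lambda z: all((not y(x)) or z(x) for x in ENTITIES)

# A proper name becomes a noun phrase meaning that applies a predicate to it.
name_np = lambda x: lambda y: y(x)

# Simple verb phrase: the object noun phrase takes scope over the verb.
vp = lambda z: lambda y: lambda x1: y(lambda x2: z(x2)(x1))

# Clause: the subject noun phrase is applied to the verb phrase.
clause = lambda np: lambda verb_phrase: np(verb_phrase)

jack_builds_a_house = clause(name_np("jack"))(vp(build)(det_a(house)))
jack_sells_a_car = clause(name_np("jack"))(vp(sell)(det_a(car)))
every_builder_builds_a_house = clause(det_every(builder))(vp(build)(det_a(house)))
```

In this model the first and third sentences come out true and the second false, which matches the truth conditions of the corresponding formulas for the chosen facts.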

Of course, along with these, the language represented by $(\Delta_S, \mathrm{S})$ also admits 
semantically infelicitous sentences like, for example, "a house builds a 
builder" or "Jack is a house". If one would like to make such sentences ungrammatical, 
i.e. exclude them from the domain of the language represented 
by $(\Delta_S, \mathrm{S})$, this can be achieved by applying more restrictive relational semantic 
rules than the simple functional rules used in the above axioms; 
such rules must account for the animacy categories of nouns, as well as the categories 
of subjects and objects applicable to a verb. Another solution, less elegant 
but possibly simpler and more efficient in practice, is to split lexicons 
and derived languages into semantically homogeneous components and allow 
concatenation of only matching pairs, like, for example: 

$$\mathrm{Cl}_{stt} = (\mathrm{NPa} ** \mathrm{VPa} \vee \mathrm{NPi} ** \mathrm{VPi}) \Rightarrow I_{((et)t)(et)t},$$

where NPa, NPi are to represent animate and inanimate noun phrases and 
VPa, VPi to represent verb phrases applicable to animate and inanimate 
subjects. Note that this approach leads to splitting the verb lexicon into components 
corresponding to different generic sentence frames as defined in (Fellbaum, 
1998); in particular, the verbs "build" and "sell", in terms of WordNet, 
will correspond to the frame "Somebody ----s something" and the verb "be" will 
be shared by the frames "Something ----s something" and "Somebody ----s 
somebody". 

8 Context type, instructive and context-dependent languages 

Introduction of an additional sort of individuals in $Ty_2$ of (Gallin, 1975) made 
it possible to accurately interpret the intensional logic (IL) in this theory, which 
implied interpretation of the corresponding type ($s$ in Gallin's notation, not 
to be confused with the symbolic type of this work) as a set of possible "states 
of world" or "states of affairs". Similarly, $Ty_n^{\mathcal A}$ with $n > 2$ admits 
the same interpretation of one of its primitive types (let it be, for definiteness, 
$c =_{def} e_{n-1}$) and thus turns out to be capable of representing intensional 
languages, i.e. languages with intensional semantics, as well. In combination 
with the capabilities provided by the symbolic type, however, a more specific 
interpretation of type $c$ in $Ty_n^{\mathcal A}$ allows us to go far beyond this straightforward 
generalization and also achieve a natural and effective account of those aspects 
of a language which are usually referred to in the literature as dynamic 
semantics (Kamp, 1981; Heim, 1983; Seuren, 1985; Groenendijk & Stokhof, 
1989; Groenendijk & Stokhof, 1990). 

The basic idea is to extend $Ty_n^{\mathcal A}$ with non-logical axioms like 

$$\lambda x_\alpha (\mathrm{Deref}_{\alpha c\alpha}\, S_\alpha\, (P_{cc}\, (\mathrm{Setref}_{\alpha\alpha cc}\, S_\alpha\, x_\alpha\, y_c))) = I_{\alpha\alpha} \qquad (8.1)$$

$$\mathrm{Unset}_{\alpha cc}\, S_\alpha \circ P_{cc} \circ \mathrm{Setref}_{\alpha\alpha cc}\, S_\alpha\, x_\alpha = I_{cc} \qquad (8.2)$$

where there could be some restrictions on the structure of terms $S_\alpha$ and $P_{cc}$ 
and 

$$(\circ_{(\beta\gamma)(\alpha\beta)\alpha\gamma}) =_{def} \lambda p_{\beta\gamma} \lambda q_{\alpha\beta} \lambda x_\alpha (p_{\beta\gamma}\, (q_{\alpha\beta}\, x_\alpha)),$$

which make it possible to interpret type $c$ as a "store" (Sabry, 1995) or, in the terminology 
of programming languages, "state" and to transform a $Ty_n$ representation of a 
language to the state-passing style (SPS) (Wadler, 1992; Sabry, 1995; Jones, 
2003). In such a transformation the state actually plays the role of a context, 
passing "side effects" of parsing some constituents of an expression to others, 
as we explore in detail below. This justifies our referring to the type $c$ as the 
context type (and motivates the notation). 

Note that $cc$-terms like $\mathrm{Setref}_{\alpha\alpha cc}\, S_\alpha\, y_\alpha$ and $\mathrm{Unset}_{\alpha cc}\, S_\alpha$ present 
instructions to "modify" a context, in the sense defined by the axioms 8.1, 8.2. 
Proceeding from this observation, one can further assume certain $cc$-terms to 
denote instructions whose semantics might not be representable in $Ty_n^{\mathcal A}$, but 
still could be implemented in a computing or another system. Instructions 
to add or remove a non-logical axiom to or from a set of axioms associated 
with a context, or verify whether a given formula is provable from this set are 
important examples of such "meta"-instructions. Although their semantics 
are not representable within $Ty_n$, they still might be denoted by $cc$-terms, 
like, say, $\mathrm{Assert}_{tcc}\, A_t$, $\mathrm{Refute}_{tcc}\, A_t$ or $\mathrm{Test}_{tcc}\, A_t$, and there could be a 
computing system capable of evaluating them in some conditions and providing 
an appropriate feedback in the event of failure. Other unrepresentable 
instructions might force a system to exchange information with an external 
environment via input/output devices or even to execute certain physical 
actions (like robotic ones). 

It is easy to notice that associating a context with a set of non-logical 
axioms makes our $c$ type conceptually similar to the World type of (Pollard, 
2005). The principal technical (formal) difference, however, is that the World 
type in the hyperintensional semantics is defined as a derived subtype of the 
type of functions from Prop to Bool, while Prop is introduced as a primitive 
type. In our model, conversely, the context type $c$ is introduced 
as primitive and the propositional type can be defined as the functional type 
$ct$ (see (Gallin, 1975)). A fundamental theoretical benefit of the hyperintensional 
semantics is that it admits internalizing (i.e. making representable 
within the HOL) the entailment relation between propositions, along with 
the property of a set of propositions to be an ultrafilter, i.e. a maximal 
consistent set of propositions. However, as the World type is defined just by 
the ultrafilterhood predicate, i.e., informally speaking, assumed to "settle every 
issue" (Pollard, 2005), it is not obvious whether this theoretical benefit would 
imply practical benefits in an implementation of this formalism. At the same 
time, our context type, of course, enforces no assumptions about completeness 
(or even about consistency) of axioms associated with a context, which 
seems to be more pragmatic and realistic. 

To summarize this introductory informal discussion, we can state that, 
while type $s$ of (Gallin, 1975) and type World of (Pollard, 2005) stand for 
"states of world", the similar type $c$ of our model stands rather for 
"states of an individual mind", a mind which is not necessarily complete or 
consistent but which, on the one hand, still determines the interpretation of a text and, 
on the other hand, is changeable (in particular, extendable) in the process of 
interpretation. 

Moving on to the formalism, we first note that a superposition like 

$$P^n_{cc} \circ P^{n-1}_{cc} \ldots \circ P^1_{cc}$$

represents a sequence of instructions, executed in order from right to left, that 
is a program, while the identity $I_{cc}$ is the empty sequence, or "do-nothing" 
instruction. (Of course an arbitrary $cc$-term would not invariably represent 
an executable instruction, but a real implementation should necessarily provide 
some sort of exception handling, reducing a non-executable instruction 
to another or possibly a do-nothing instruction.) 
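
The reading of a superposition as a program executed from right to left can be sketched in Python. The sketch assumes, purely for illustration, that a context is a dictionary and that an instruction is a function from contexts to contexts; `program`, `setref` and `unset` are hypothetical helper names loosely inspired by the terms mentioned in the text, not definitions taken from it.

```python
from typing import Callable, Dict

Context = Dict[str, object]
Instruction = Callable[[Context], Context]


def program(*instructions: Instruction) -> Instruction:
    """Compose instructions so that the rightmost one runs first."""
    def run(ctx: Context) -> Context:
        for instr in reversed(instructions):
            ctx = instr(ctx)
        return ctx
    return run


def setref(name: str, value: object) -> Instruction:
    """Return an instruction that binds name to value in a copy of the context."""
    return lambda ctx: {**ctx, name: value}


def unset(name: str) -> Instruction:
    """Return an instruction that removes name from a copy of the context."""
    return lambda ctx: {k: v for k, v in ctx.items() if k != name}


# Reads right to left: set x, then set y, then unset x.
p = program(unset("x"), setref("y", 2), setref("x", 1))
result = p({})
```

Calling `program()` with no arguments yields the do-nothing instruction, the counterpart of the identity as the empty sequence.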

Thus, expressions of a $cc$-language mean some (either executable or non-executable) 
instructions and therefore a text, i.e. an ordered sequence of 
such expressions, can be treated as a corresponding program. A few obvious 
features of such an instructive language make it different in principle from a 
$t$-language and even a $ct$-language: 

1. an instructive language text meaning captures the order of statements, 
that is, its semantic interpretation can depend on that order; 

2. it can also capture contradictory or equivalent truth meanings nested 
in subsequent instructions; 

3. it provides a natural way to distinctly represent declarative, interrogative 
and imperative semantics. 

Given an initial context $C_c$, an instructive language text meaning $P_{cc}$ 
determines a new context $P_{cc}\, C_c$ depending on both the initial context $C_c$ 
and the program $P_{cc}$ which changes it, in the sense that a continuation of the 
text would already be applied to the new context. However, the program $P_{cc}$ 
itself here does not depend on the initial context. Our next step is to employ 
the context type for building context-dependent languages. 

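Definition 8.1. A context-dependent $\alpha$-language is a $c(c \times \alpha)$-language. 

Given a context-dependent $\alpha$-language $\mathcal L$ and a context term $C_c$, the 
operation of context application 

$$\mathcal L \downarrow C_c =_{def} \mathcal L \triangleright \mathcal{CA}(C_c),$$

where the rule $\mathcal{CA}(C_c) \subset \mathcal T_{c(c \times \alpha)} \otimes \mathcal T_{c \times \alpha}$ is represented by 

$$CA_{(cc)(c\alpha)c\alpha t}(C_c) =_{def} \lambda x_{cc} \lambda y_{c\alpha} ((= x_{cc}\, C_c)\,\&\,(= y_{c\alpha}\, C_c)),$$

and the operation of context instantiation 

$$\mathcal L \Downarrow C_c =_{def} \mathcal L \triangleright \mathcal{CI}(C_c),$$

where the rule $\mathcal{CI}(C_c) \subset \mathcal T_{c(c \times \alpha)} \otimes \mathcal T_\alpha$ is represented by 

$$CI_{(cc)(c\alpha)\alpha t}(C_c) =_{def} \lambda x_{cc} \lambda y_{c\alpha} (= y_{c\alpha}\, C_c),$$

produce a $c \times \alpha$-language and a regular $\alpha$-language, respectively, that depend 
on a context $C_c$. Inversely, given a regular $\alpha$-language $\mathcal L$ and a term $F_{\alpha cc}$, the 
operation of context raising 

$$\mathcal L \Uparrow F_{\alpha cc} =_{def} \mathcal L \triangleright \mathcal{CR}(F_{\alpha cc}),$$

where the rule $\mathcal{CR}(F_{\alpha cc}) \subset \mathcal T_\alpha \otimes \mathcal T_{c(c \times \alpha)}$ is represented by 

$$CR_{\alpha(cc)(c\alpha)t}(F_{\alpha cc}) =_{def} \lambda x_\alpha ((= F_{\alpha cc}\, x_\alpha)\,\&\,(= \lambda x_c x_\alpha)),$$

produces a context-dependent $\alpha$-language. Note that for any language and 
context 

$$\vdash (\mathcal L \Uparrow \lambda x_\alpha I_{cc}) \Downarrow C_c = \mathcal L,$$

that is, raising a regular language by $\lambda x_\alpha I_{cc}$ converts it into a context-dependent 
language which however is essentially equivalent to the original 
one. 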
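
At the level of individual meanings, the three operations can be imitated in Python. This is a minimal sketch under explicit assumptions: a context-dependent meaning is taken to be a pair of functions (a context update and a context-dependent value), contexts are dictionaries, and raising pairs the update obtained from the meaning with a value that ignores the context. The names `apply_context`, `instantiate`, `raise_meaning`, `no_effect` and `remember` are hypothetical.

```python
def apply_context(meaning, ctx):
    """Context application: evaluate both components in the given context."""
    update, value = meaning
    return update(ctx), value(ctx)


def instantiate(meaning, ctx):
    """Context instantiation: keep only the value evaluated in the context."""
    update, value = meaning
    return value(ctx)


def raise_meaning(a, effect):
    """Context raising: effect(a) is the context update, the value ignores context."""
    return effect(a), (lambda ctx: a)


# Raising with an effect that leaves every context unchanged.
no_effect = lambda a: (lambda ctx: ctx)
# Raising with a hypothetical effect that records the meaning in the context.
remember = lambda a: (lambda ctx: {**ctx, "last": a})

round_trip = instantiate(raise_meaning("House", no_effect), {"k": 1})
applied = apply_context(raise_meaning("House", remember), {})
```

The value `round_trip` illustrates the remark that raising by the do-nothing effect followed by instantiation gives back the original meaning, whatever the context.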

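It is also noticeable that an instructive language, that is a regular $cc$-language, 
turns out to be a particular case of a context-dependent unit-type language. 
The reason why a context-dependent $\alpha$-language is required to be of type 
$c(c \times \alpha)$, rather than of a simpler type $c\alpha$, should become obvious from the 
following. For the sake of brevity we employ the notation $\bar\alpha =_{def} c(c \times \alpha)$ (for an 
arbitrary type $\alpha$) everywhere below. 

Definition 8.2. Let $\mathcal L$ be a context-dependent $\alpha$-language and $\mathcal K$ a context-dependent 
$\beta$-language. Then their anaphoric logical concatenation is the 
context-dependent $\alpha \times \beta$-language 

$$\mathcal L \overrightarrow{*} \mathcal K =_{def} \mathcal L * \mathcal K \triangleright \mathcal{AC},$$

where the rule $\mathcal{AC} \subset \mathcal T_{\bar\alpha \times \bar\beta} \otimes \mathcal T_{\overline{\alpha \times \beta}}$ is represented by the term 

$$AC_{\bar\alpha \bar\beta\, \overline{\alpha \times \beta}\, t} =_{def} \lambda x^1_{cc} \lambda y^1_{c\alpha} \lambda x^2_{cc} \lambda y^2_{c\beta} ((= x^2_{cc} \circ x^1_{cc})\,\&\,(= y^1_{c\alpha})\,\&\,(= y^2_{c\beta} \circ x^1_{cc})),$$

and their cataphoric logical concatenation is the context-dependent $\alpha \times \beta$-language 

$$\mathcal L \overleftarrow{*} \mathcal K =_{def} \mathcal L * \mathcal K \triangleright \mathcal{CC},$$

where the rule $\mathcal{CC} \subset \mathcal T_{\bar\alpha \times \bar\beta} \otimes \mathcal T_{\overline{\alpha \times \beta}}$ is represented by the term 

$$CC_{\bar\alpha \bar\beta\, \overline{\alpha \times \beta}\, t} =_{def} \lambda x^1_{cc} \lambda y^1_{c\alpha} \lambda x^2_{cc} \lambda y^2_{c\beta} ((= x^1_{cc} \circ x^2_{cc})\,\&\,(= y^1_{c\alpha} \circ x^2_{cc})\,\&\,(= y^2_{c\beta})).$$

For an informal interpretation of this definition consider the superposition 
of an anaphoric concatenation with a context application: 

$$(\mathcal L \overrightarrow{*} \mathcal K) \downarrow C_c = (\mathcal L * \mathcal K) \triangleright (\mathcal{CA}(C_c) \circ \mathcal{AC}),$$

where the rule superposition $\mathcal{CA}(C_c) \circ \mathcal{AC}$ is represented by the term 

$$\lambda x^1_{cc} \lambda y^1_{c\alpha} \lambda x^2_{cc} \lambda y^2_{c\beta} ((= x^2_{cc}\, (x^1_{cc}\, C_c))\,\&\,(= y^1_{c\alpha}\, C_c)\,\&\,(= y^2_{c\beta}\, (x^1_{cc}\, C_c))).$$

Thus, the anaphoric concatenation affects a meaning of the right-hand constituent 
by a context affected by the left-hand one, while the latter is affected 
directly by the initial context. Symmetrically, for the cataphoric concatenation, 
the effective rule superposition is represented by the term 

$$\lambda x^1_{cc} \lambda y^1_{c\alpha} \lambda x^2_{cc} \lambda y^2_{c\beta} ((= x^1_{cc}\, (x^2_{cc}\, C_c))\,\&\,(= y^1_{c\alpha}\, (x^2_{cc}\, C_c))\,\&\,(= y^2_{c\beta}\, C_c)),$$

that is, in this case an initial context is passed to the right-hand constituent 
and the affected context to the left-hand one. 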
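
The flow of context just described can be traced with a small Python sketch. It assumes, for illustration only, that a context-dependent meaning is a pair (context update, context-dependent value) and that contexts are dictionaries; the two sample meanings, one that introduces a referent and one that reads it, are hypothetical.

```python
def anaphoric(left, right):
    """Left constituent sees the initial context; right sees the context updated by left."""
    x1, y1 = left
    x2, y2 = right
    return (lambda c: x2(x1(c)), y1, lambda c: y2(x1(c)))


def cataphoric(left, right):
    """Right constituent sees the initial context; left sees the context updated by right."""
    x1, y1 = left
    x2, y2 = right
    return (lambda c: x1(x2(c)), lambda c: y1(x2(c)), y2)


# Hypothetical meanings: one sets a referent, the other looks it up.
introduce = (lambda c: {**c, "he": "Jack"}, lambda c: "Jack")
pronoun = (lambda c: c, lambda c: c.get("he"))

update, first, second = anaphoric(introduce, pronoun)
resolved_forward = second({})                           # pronoun after its antecedent
_, early, _ = cataphoric(pronoun, introduce)
resolved_backward = early({})                           # pronoun before its antecedent
_, unresolved, _ = anaphoric(pronoun, introduce)
```

With the anaphoric combination the pronoun is resolved only when it follows the introducing constituent; the cataphoric combination resolves it when it precedes.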

Note that the construction of type $\bar\alpha =_{def} c(c \times \alpha)$ coincides with the 
construction of the state monad type in functional programming languages 
(Wadler, 1992; Jones, 2003) and the context raising rule 

$$CR_{\alpha(cc)(c\alpha)t}(\lambda x_\alpha I_{cc}) =_{def} \lambda x_\alpha ((= I_{cc})\,\&\,(= \lambda x_c x_\alpha))$$

precisely corresponds to the return operator of that monad. It is also easy 
to verify that the rules $AC_{\bar\alpha \bar\beta\, \overline{\alpha \times \beta}\, t}$ and $CC_{\bar\alpha \bar\beta\, \overline{\alpha \times \beta}\, t}$ defining operations $\overrightarrow{*}$ and 
$\overleftarrow{*}$ come as results of binding by the corresponding operator of the same 
monad, in the first case a meaning $(x^1_{cc}, y^1_{c\alpha})$ of the left constituent with the 
operation 

$$AK_{\alpha\, \overline{\alpha \times \beta}} =_{def} \lambda x_\alpha (x^2_{cc}, \lambda y_c x_\alpha, y^2_{c\beta})$$

and in the second case a meaning $(x^2_{cc}, y^2_{c\beta})$ of the second constituent with 
the symmetric operation 

$$CK_{\beta\, \overline{\alpha \times \beta}} =_{def} \lambda x_\beta (x^1_{cc}, \lambda y_c x_\beta, y^1_{c\alpha}),$$

which operations merely combine a monadic-type meaning of one of the constituents 
with a regular-type meaning of the other. Of course, this note is 
nothing else than a rephrasing of the above informal interpretation of operations 
$\overrightarrow{*}$ and $\overleftarrow{*}$, in terms of the state monad. 

As the new operations $\overrightarrow{*}$, $\overleftarrow{*}$, $\Uparrow$ and $\Downarrow$ are defined as rule applications 
to regular concatenations or just special rule applications, they also reveal 
associativity and monotonicity properties: 

Lemma 8.1. For any context-dependent $\alpha$-languages $\mathcal K$, $\mathcal L$, a context-dependent 
$\beta$-language $\mathcal M$, regular $\alpha$-languages $\mathcal K_0$, $\mathcal L_0$ and a context term $C_c$ 



xV(;cD£) = m'^kum'^c, 

(/CoU£o)tC^c = /Co^CeD£o^C„ 
(/CD£) ^Cc = /C^CcU£^Cc, 

/Co C Co =^ /Co 1T Cc C £0 1T Cc, 
K, d C /Co JJ- Cc c >Co JJ- Cc- 

Proposition 8.2. //a context-dependent a-language C has a pseudo-canonic 
representation (A, Lgat) o,nd a context-dependent ^-language K, has a pseudo- 
canonic representation (A,ir^^J, then the concatenations K, and K, 

have pseudo- canonic representations (A, Lgm ^ -^s^J ^.nd ((A, Lgm ^ ^s^t) 
respectively, where 



[Xs = X^ 



l + x^AXcc = xl^oxl^Ayc^ = yl^oxl^Az^atX^xl^yca^z^-^^xlxl^yl^ 



( * {sm){si3t)saxi3t) — def ^Zsm^Zg|^^XxsXxccXyca^ycp3xl3xl3xl^3xl^3yl^ 

{Xs = xl+X^AXcc = xJc°^cc/\?/ca = yL°^lc ^ ^sat^lxiy^^ A Z^-^^xlxl^yc/s) ■ 

Proof. The proof follows from Propositions 7.7, 7.8 and Definition 8.2, taking 
into account that the terms ^C^^^^^ and CC-^-^^^ are canonic represen- 
tations of the corresponding rules (in any domain). □ 

Proposition 8.3. If a context-dependent $\alpha$-language $\mathcal{L}$ has a
pseudo-canonic representation $(\Delta, L_{s\alpha t})$, then a context
instantiation $\mathcal{L} \Downarrow C_c$ has a pseudo-canonic representation
$(\Delta, L_{s\alpha t} \Downarrow C_c)$, where

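$(\Downarrow_{(s\alpha t)cs\alpha t}) =_{def} \lambda z_{s\alpha t} \lambda z_c \lambda x_s \lambda y_\alpha \exists x_{cc} \exists y_{c\alpha} (z_{s\alpha t}\, x_s\, x_{cc}\, y_{c\alpha} \wedge y_\alpha = y_{c\alpha}\, z_c).$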

Proof. The proof follows from Proposition 7.6, taking into account that the 
term CAacat{Cc) is a canonic representation (in any domain) of the rule that 
defines the context instantiation operation. □ 

Proposition 8.4. If an $\alpha$-language $\mathcal{L}$ has a pseudo-canonic
representation $(\Delta, L_{s\alpha t})$, then a context raising
$\mathcal{L} \Uparrow C_{\alpha cc}$ has a pseudo-canonic representation
$(\Delta, L_{s\alpha t} \Uparrow C_{\alpha cc})$, where



("fr (sat) (acc) sat ) 



(^sat ^a ^ "^cc Z, 



acc -^a ^ -^ca 



Proof. Similar to Proposition 8.3, with taking into account that the term 
CRaat is a canonic representation (in any domain) of the rule that defines 
the context raising operation. □ 

Lemma 8.5. The $Ty_n^A$ operators of logical join, anaphoric or cataphoric
concatenation and context instantiation or raising reproduce the associativity
and monotonicity properties:



{Xsat V ysat) ^ ^sFt 



Xsat * Zgp^ V ysat * Z^^^^, 



— > 



\- (Xsat V ysat) ^ Z^ 



* i^sat V ysat) — Z^i^f. * Xsat V ^^.^^ * ysat, 



h Z. 



s/3t ~ ^sat * ^s/3t V ysat * Z^p^, 



I3t 



* {Xsat V ysat) — Z^p^ * Xsat V Z^^^ * |/sat) 



^ (-f-sat V ysat) lT Zacc -^sat il Zacc V ysat il 



^ (-^sat ^ Vsat 

^ {-^scit ^ Vsat 

^ i-^sat ^ Vsat 

^ i-^sat ^ 1/sat 

l~ (-^sat ^ ^sat 

^ (-^sat ^ |/sat 





■^sat 




i^sat * 




(^s^f ^ 




[^sat * 




i^s]3t ^ 


— > 


(•^sat fr 




(^sat ■^1' 



^s/3f ~^ Vsat ^ 2;^^t)) 
-sat ^s]3t * l/sot). 



-'sat 



s/3t ^ 1/sot) 






Proof. The proof follows from the definitions of the operators and Lemma 
7.9. □ 



Propositions 8.2-8.4 and Lemmas 8.1 and 8.5 show that the main result of
Section 7, Theorem 7.10, generalizes in a straightforward manner to a
recursive grammar that contains a mix of regular and context-dependent
languages and relations involving all seven operations: the regular
operations of Section 7 together with anaphoric and cataphoric concatenation,
context raising $\Uparrow$ and context instantiation $\Downarrow$.

Let us see how the sample recursive grammar built in the previous section
can be converted to an instructive (and hence context-dependent) language
grammar and extended to acquire two important features: the distinction
between declarative and interrogative semantics, and pronoun dereferencing.
To avoid excessive technical complexity and keep the example illustrative,
we allow the pronoun "he" to appear only in the position of a clause subject,
referencing the subject of another (previous) clause expressed by a name, and
the pronoun "it" only in the position of an object, referencing the object of
another (previous) affirmative verb phrase. The first can be achieved by
defining the subject noun phrase by axioms like

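$SNP = (Det ** Noun \Rightarrow I_{((et)(et)t)(et)(et)t}) \Uparrow \lambda x_{(et)t} I_{cc} \;\vee$
$\qquad (Name \Rightarrow \lambda x_e \lambda y_{et} (y\, x)) \Uparrow Setref\ He_{(et)t} \;\vee$
$\qquad SPron \Rightarrow \langle I_{cc}, Deref_{((et)t)c(et)t} \rangle,$
$SPron_{s((et)t)t} = (= /he/) \Rightarrow He_{(et)t}.$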

Note that the constant $He_{(et)t}$ here does not denote any specific meaning,
but plays the role of a symbol to which a meaning is assigned and which
carries that meaning to a corresponding reference.

Accurate treatment of the pronoun in object position (which we want to be
capable of referencing an arbitrarily expressed object of another verb
phrase) requires a more sophisticated technique. First, we specialize the
type of the object noun phrase to $(eet)et$ and define it by the axioms

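$ONP = (NP \Rightarrow O) \Uparrow \lambda x_{(eet)et} I_{cc} \;\vee$ (8.3)
$\qquad OPron \Rightarrow \langle I_{cc}, Deref_{((eet)et)c(eet)et} \rangle,$
$OPron_{s((eet)et)t} = (= /it/) \Rightarrow It_{(eet)et},$

where $NP$ is defined the same way as in the previous section and

$O_{((et)t)(eet)et} =_{def} \lambda n_{(et)t} \lambda v_{eet} \lambda y_e (n\, \lambda x_e (v\, x\, y))$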

simply maps its $(et)t$ meaning to $(eet)et$. Note that these definitions do
not yet set the $It$ reference. This is because, in the general case, it
should be dereferenced to a meaning determined not only by the object noun
phrase itself, but also by the verb applied to it. For example, "it"
referencing the object in the verb phrases "builds a house" and "sells a
house" should be dereferenced to different meanings: "a house which is built"
and "a house which is sold", respectively. Therefore, the $It$ reference can
be set only upon processing the entire verb phrase, as captured in the
following axiom:

VPs,itt = Yerht*{^ /s/)**ONP ^ {Setit, Apply) V (8.4) 
((= /is/) ^ /be/) > VerbBe) ** ONP ^ 

{XVeeAXccXyc(eet)eticc: Apply) V 

((= /does not/) **Verbt V ((= /is not/) =^ /be/) > VerbBe) =^ 

~ Apply), 

where 

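$SetIt =_{def} \lambda v_{eet} \lambda x_{cc} \lambda y_{c(eet)et} \lambda z_c (Setref\ It\ \lambda u_{eet} (y\, (x\, z)\, (v \wedge u))),$
$Apply =_{def} \lambda v_{eet} \lambda x_{cc} \lambda y_{c(eet)et} \lambda z_c (y\, (x\, z)\, v).$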

We shall see below how the interaction of axioms 8.3 and 8.4 yields the
correct semantics, but first we have to complete our language definition.

The context-dependent versions of the compound declarative verb phrase,
clause and sentence can be defined straightforwardly by the axioms

VPS«(= /and/) Jl VPs ^ p {Aiet)(et)et)) V 
VPs * (= /,/) Jl VPm ^ p {Aiet)(et)et), 

VPs V VPm, 

SNPJIVP -»pl((et)t)(et)t, 

CI *(=/./) V (CI* (=/,/) it DS V Cl**(=/and/)JtDS) ^p(A«t), 
(*l) =def XxsatXy,-^t x*{= / /)^y, 

P{ali)a'p =def Xz^fiXyccXyca {Vcc, Xx^ {z {yea x))). 



VPm,^, 



VP 

^ s{et)tt 



where 
and 

Now, to add the simplest ("yes/no") interrogative sentence to our language,
we need axioms like

Verbt ** ONP {Setit, Apply), 

ONP =^ {XXccXyc{eet)edcc, Apply (=eet)), 

((= /does/)**SNPJllVPdo V (= /is/) ** SNP Jl IVPbe) ^ 

P I((et)t)(et)t, 

IC1*(= /?/). 

Finally, we wrap the declarative and interrogative sentences into the general 
sentence of the instructive type: 

Ss(cc)t = DS =^ AyccAyct(Asserttec o Va o Vcc) V (8.5) 
IS ^ AyccAyct(Testtcc o Vet ° Vcc)- 

If $\Delta_S$ now denotes the new set of axioms, then, due to the results of
this section and under the assumption that $Ty_n^A$ has been extended by
axioms 8.1 and 8.2, we will have, for example:

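$\Delta_S \vdash S$ /Jack is a builder./
  $(Assert\ \exists x_e (Builder\ x_e \wedge Jack = x_e) \circ Setref\ He\ Jack),$

$\Delta_S \vdash S$ /is Jack a builder?/
  $(Test\ \exists x_e (Builder\ x_e \wedge Jack = x_e) \circ Setref\ He\ Jack),$

$\Delta_S \vdash S$ /Jack is a builder, he builds a house./
  $(Assert\ (\exists x_e (Builder\ x_e \wedge Jack = x_e) \wedge$
  $\exists x_e (Build\ x_e\ Jack \wedge House\ x_e)) \circ Setref\ He\ Jack),$

$\Delta_S \vdash S$ /Jack builds a house and sells it./
  $(Assert\ (\exists x_e (Build\ x_e\ Jack \wedge House\ x_e) \wedge$
  $\exists x_e ((Build \wedge Sell)\ x_e\ Jack \wedge House\ x_e)) \circ$
  $Setref\ It\ \lambda u_{eet} \lambda y_e \exists x_e ((Build \wedge u_{eet})\ x_e\ y_e \wedge House\ x_e) \circ$
  $Setref\ He\ Jack).$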

Due to the metatheorem

$\vdash (\exists x_e A \wedge \exists x_e (A \wedge B)) = \exists x_e (A \wedge B),$



IVPdo,3, 
IVPbe,^, 



IS 



stt 

where $A$ and $B$ are arbitrary formulas, the last derivation also implies a
shorter form

$\Delta_S \vdash S$ /Jack builds a house and sells it./
  $(Assert\ \exists x_e ((Build \wedge Sell)\ x_e\ Jack \wedge House\ x_e) \circ$
  $Setref\ It\ \lambda u_{eet} \lambda y_e \exists x_e ((Build \wedge u_{eet})\ x_e\ y_e \wedge House\ x_e) \circ$
  $Setref\ He\ Jack).$
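
The metatheorem used in this simplification is easy to sanity-check on small finite models. The following sketch is not a proof and not part of the formalism; it merely enumerates all pairs of unary predicates over a small domain and compares both sides. The function name `metatheorem_holds` is hypothetical.

```python
from itertools import product


def metatheorem_holds(domain_size: int = 3) -> bool:
    """Check (Ex A(x) and Ex (A(x) and B(x))) == Ex (A(x) and B(x))
    for every pair of unary predicates A, B over a finite domain."""
    domain = range(domain_size)
    truth_tables = list(product([False, True], repeat=domain_size))
    for a in truth_tables:          # a[x] is the truth value of A at x
        for b in truth_tables:      # b[x] is the truth value of B at x
            lhs = any(a[x] for x in domain) and any(a[x] and b[x] for x in domain)
            rhs = any(a[x] and b[x] for x in domain)
            if lhs != rhs:
                return False
    return True
```

In the derivation above, $A$ corresponds to "Build $x_e$ Jack and House $x_e$" and $B$ to "Sell $x_e$ Jack", so the first conjunct is absorbed by the second.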
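Similarly, we would also have

$\Delta_S \vdash S$ /Every builder builds a house and sells it./
  $(Assert\ (Builder \rightarrow \lambda y_e \exists x_e ((Build \wedge Sell)\ x_e\ y_e \wedge House\ x_e)) \circ$
  $Setref\ It\ \lambda u_{eet} \lambda y_e \exists x_e ((Build \wedge u_{eet})\ x_e\ y_e \wedge House\ x_e)).$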

The last examples give some hints on how this formalism could address don- 
key pronouns (Geach, 1962; Kamp, 1981; Heim, 1982; Elbourne, 2006), re- 
maining entirely within the mainstream higher-order logic semantics. Of 
course, elaboration of this topic deserves a separate publication. 

Note that, due to the special semantic rules applied in axiom 8.5, the
sentence meaning captures not only an assertion or test instruction, but also
all the "side effects" of its reading. This is essential in order to pass the
side effects between sentences, if we would like to further define a language
like
Texts(cc)t = S V SJ^Text =^ {o{cc){cc)c(^^ 
so that, if $\Delta_S$ contains this last axiom too, then we would also have

$\Delta_S \vdash Text$ /Jack is a builder, he builds a house./
  $(Assert\ \exists x_e (Build\ x_e\ Jack \wedge House\ x_e) \circ$
  $Setref\ It\ \lambda u_{eet} \lambda y_e \exists x_e ((Build \wedge u_{eet})\ x_e\ y_e \wedge House\ x_e) \circ$
  $Assert\ \exists x_e (Builder\ x_e \wedge Jack = x_e) \circ$
  $Setref\ He\ Jack).$
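
To see how such a composed $cc$ meaning acts on a context, here is a minimal sketch. Everything in it is an assumption made for illustration: the context is a dictionary with hypothetical fields `refs`, `facts` and `answers`, formulas are stood in for by plain strings, and `check_fact` (playing the role of Test) simply looks a fact up among those already asserted. The actual behaviour of Setref, Assert and Test is fixed by the axioms of the theory, not by this code.

```python
from functools import reduce


def compose(*fs):
    """Right-to-left composition: compose(f, g)(c) == f(g(c))."""
    return lambda ctx: reduce(lambda acc, f: f(acc), reversed(fs), ctx)


def setref(name, value):
    """Side effect: bind a reference name to a meaning."""
    return lambda ctx: {**ctx, "refs": {**ctx["refs"], name: value}}


def assert_fact(fact):
    """Instruction: add a fact to the context."""
    return lambda ctx: {**ctx, "facts": ctx["facts"] + [fact]}


def check_fact(fact):
    """Instruction: record whether a fact is already in the context."""
    return lambda ctx: {**ctx, "answers": ctx["answers"] + [fact in ctx["facts"]]}


EMPTY = {"refs": {}, "facts": [], "answers": []}

# "Jack is a builder." followed by "Is Jack a builder?"
declarative = compose(assert_fact("Builder(Jack)"), setref("He", "Jack"))
interrogative = compose(check_fact("Builder(Jack)"), setref("He", "Jack"))
text = compose(interrogative, declarative)
```

As in the derivations above, the rightmost component is applied first, so the side effects of an earlier sentence are visible to a later one.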

9 Translation/expression operators and 
partial translation 

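Let an $\alpha$-language $\mathcal{L}$ be represented by a term $L_{s\alpha t}$
and let $w \in A^*$ be a non-ambiguous valid expression of this language, so
that

$\vdash \exists_1 x_\alpha (L_{s\alpha t}\, /w/\, x_\alpha).$

Then, by the basic property of the description operator $\iota_{(\alpha t)\alpha}$,

$\vdash \iota_{(\alpha t)\alpha} (L_{s\alpha t}\, /w/) = A_\alpha,$

where $\{A_\alpha\}$ is any generator of the meaning of $w$ in $\mathcal{L}$
(for a product type $\alpha =_{def} \beta \times \gamma$, here and below
$\iota_{(\alpha t)\alpha}$ denotes the pair

$\langle \lambda z_{\beta\gamma t}\, \iota_{(\beta t)\beta}\, \lambda x_\beta \exists x_\gamma (z_{\beta\gamma t}\, x_\beta\, x_\gamma),\ \lambda z_{\beta\gamma t}\, \iota_{(\gamma t)\gamma}\, \lambda x_\gamma \exists x_\beta (z_{\beta\gamma t}\, x_\beta\, x_\gamma) \rangle$ (9.1)

which definition may be applied recursively). Similarly, if $\{A_\alpha\}$ is a
generator of a meaning which has a single expression $w$ in $\mathcal{L}$, then

$\vdash \iota_{(st)s}\, \lambda x_s (L_{s\alpha t}\, x_s\, A_\alpha) = /w/.$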
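
Thus, given an $\alpha$-language represented by a term $L_{s\alpha t}$, the terms

$\tau_{(s\alpha t)s\alpha} =_{def} \lambda x_{s\alpha t} \lambda x_s (\iota_{(\alpha t)\alpha} (x_{s\alpha t}\, x_s))$

and

$\sigma_{(s\alpha t)\alpha s} =_{def} \lambda x_{s\alpha t} \lambda x_\alpha (\iota_{(st)s}\, \lambda x_s (x_{s\alpha t}\, x_s\, x_\alpha))$

enable $Ty_n$ representations of the expression-to-meaning and
meaning-to-expression mappings determined by the language, in forms like
$\tau_{(s\alpha t)s\alpha} L_{s\alpha t}$ and $\sigma_{(s\alpha t)\alpha s} L_{s\alpha t}$
respectively. They are referred to below as translation and expression
operators.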
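
The two operators can be mimicked on a finite toy language, viewed as a set of (expression, meaning) pairs. This is a sketch under stated assumptions, not the logical definitions themselves: `iota`, `translate`, `express` and the lexicon `TOY` are hypothetical names, and `iota` returns `None` when there is not exactly one candidate, whereas in the logic the description operator still denotes some (unspecified) value.

```python
def iota(candidates):
    """Description operator on a finite set: the unique element, else None."""
    items = list(candidates)
    return items[0] if len(items) == 1 else None


def translate(language, word):
    """Analogue of tau: the unique meaning of `word`, if there is one."""
    return iota({m for (w, m) in language if w == word})


def express(language, meaning):
    """Analogue of sigma: the unique expression of `meaning`, if there is one."""
    return iota({w for (w, m) in language if m == meaning})


# Hypothetical toy language: "bank" is ambiguous, "House" has two expressions.
TOY = {
    ("jack", "Jack"),
    ("house", "House"),
    ("home", "House"),
    ("bank", "RiverBank"),
    ("bank", "MoneyBank"),
}
```

`translate` succeeds only for valid non-ambiguous expressions, and `express` only for meanings with a single expression, mirroring the two uniqueness conditions stated above.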

Application of the translation operator to an invalid expression is, of
course, not provably equal to any meaning. However, it still represents a
meaning that the expression might acquire upon possible completion of the
language definition. Here is a precise formulation of this statement:

Proposition 9.1. Let $(\Delta, L_{s\alpha t})$ be a pseudo-canonic
representation of a language $\mathcal{L}$ and

$L^+_{s\alpha t} =_{def} L_{s\alpha t} \vee \lambda x_s (= \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, x_s).$ (9.2)

If $w$ is either a valid and non-ambiguous or an invalid expression of
$\mathcal{L}$, then

$\Delta \vdash \tau_{(s\alpha t)s\alpha} L^+_{s\alpha t}\, /w/ = \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /w/.$

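Proof. If $w$ is a non-ambiguous valid expression of $\mathcal{L}$ and
$A_\alpha$ its meaning, then

$\Delta \vdash L_{s\alpha t}\, /w/ = (= A_\alpha)$

and

$\Delta \vdash \iota_{(\alpha t)\alpha} (L_{s\alpha t}\, /w/) = A_\alpha,$

so that

$\Delta \vdash L^+_{s\alpha t}\, /w/ = L_{s\alpha t}\, /w/.$

Otherwise, if $w$ is an invalid expression, then by Lemma 6.1

$\Delta \vdash L_{s\alpha t}\, /w/ = \lambda x_\alpha F,$

so that

$\Delta \vdash L^+_{s\alpha t}\, /w/ = (= \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /w/)$

and therefore

$\Delta \vdash \iota_{(\alpha t)\alpha} (L^+_{s\alpha t}\, /w/) = \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /w/.$

□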

Thus, the language defined by (9.2) extends the language $\mathcal{L}$ with
all and only those pairs $(w, \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /w/)$ in
which $w$ is not a valid non-ambiguous expression of $\mathcal{L}$, while
similar pairs in which $w$ is such an expression are fully contracted. In this
sense, such a self-extension of a language $\mathcal{L}$ does not add anything
new to it. However, when a language is nested into another language of a
recursive grammar, employing its self-extension enables some new and useful
features, as established by the following

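Proposition 9.2. Let $(\Delta, L_{s\alpha t})$ and $(\Delta, K_{s\beta t})$ be
pseudo-canonic representations of languages $\mathcal{L}$ and $\mathcal{K}$
respectively and let $uv$ be a single split of a word $w$ such that $v$ is a
valid expression of $\mathcal{K}$.

If, additionally, $v$ is a non-ambiguous expression of $\mathcal{K}$ with the
meaning $B_\beta$ and $u$ is an invalid expression of $\mathcal{L}$, then

$\Delta \vdash \tau_{(s\alpha\beta t)s(\alpha \times \beta)} (L^+_{s\alpha t} * K^+_{s\beta t})\, /w/ = \langle \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /u/,\ B_\beta \rangle.$ (9.3)

If, alternatively, $uv$ is a single split of a word $w$ such that $u$ is a
valid expression of $\mathcal{L}$ and, additionally, $u$ is a non-ambiguous
expression of $\mathcal{L}$ with the meaning $A_\alpha$ and $v$ is an invalid
expression of $\mathcal{K}$, then

$\Delta \vdash \tau_{(s\alpha\beta t)s(\alpha \times \beta)} (L^+_{s\alpha t} * K^+_{s\beta t})\, /w/ = \langle A_\alpha,\ \tau_{(s\beta t)s\beta} K_{s\beta t}\, /v/ \rangle.$ (9.4)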

Proof. In a similar way to the proof of Proposition 7.7, taking into account
Proposition 9.1, for the first case we obtain

$\Delta \vdash (L^+_{s\alpha t} * K^+_{s\beta t})\, /w/ = (= \tau_{(s\alpha t)s\alpha} L_{s\alpha t}\, /u/)\ \&\ (= B_\beta)$

and for the second case

$\Delta \vdash (L^+_{s\alpha t} * K^+_{s\beta t})\, /w/ = (= A_\alpha)\ \&\ (= \tau_{(s\beta t)s\beta} K_{s\beta t}\, /v/).$

(9.3) and (9.4) follow from this according to (9.1). □

The three features of the partial translations (9.3) and (9.4) worth noting
are:

1. they capture the correct overall structure of the precise meaning that the
concatenation would acquire upon elaboration of an incomplete nested
language;

2. they allow accurate restoration of their expressions $uv = w$ even while
a nested language remains incomplete;

3. translation of only a single constituent is sufficient to convert a partial
translation to the full translation $\langle A_\alpha, B_\beta \rangle$ upon
elaboration of an incomplete nested language.

For example, consider the self-extension

$S^+ =_{def} S \vee \lambda x_s (= \tau_{(stt)st}\, S\, x_s)$

of the sample grammar we built in Section 4. If $\Delta_S$ denotes the same
set of axioms as in that section, then we will have, for example:

$\Delta_S \vdash S^+$ /Jack paints a house/
  $\exists x_e (\tau_{(s(eet)t)eet}\ Verbt\ /paint/\ x_e\ Jack \wedge House\ x_e),$
$\Delta_S \vdash S^+$ /Jack builds a computer/
  $\exists x_e (Build\ x_e\ Jack \wedge \tau_{(s(et)t)et}\ Noun\ /computer/\ x_e),$

which illustrates both features 1 and 2. Now assume the lexicon $Noun$ is
extended with an entry $(= /computer/) \Rightarrow Computer$ and let
$\Delta'_S$ denote the changed axiom set. Then the derivation

$\Delta'_S \vdash (Build\ x_e\ Jack \wedge \tau_{(s(et)t)et}\ Noun\ /computer/\ x_e) =$
  $(Build\ x_e\ Jack \wedge Computer\ x_e)$

follows immediately from

$\Delta'_S \vdash Noun\ /computer/ = (= Computer),$

which illustrates feature 3.
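
The three features can be imitated with a small sketch in which an unknown word is kept as an explicit placeholder instead of failing the whole parse. This is only an analogy under stated assumptions: the class `Pending` (standing in for an unevaluated $\tau$ term), the functions `translate_word`, `translate_pair` and `complete`, and the dictionary lexicons are all hypothetical and much simpler than the grammar of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pending:
    """Placeholder for the not-yet-known meaning of `word` in `lexicon`."""
    lexicon: str
    word: str


def translate_word(lexicons, name, word):
    """Return the meaning of `word`, or a Pending placeholder if it is unknown."""
    meaning = lexicons[name].get(word)
    return meaning if meaning is not None else Pending(name, word)


def translate_pair(lexicons, verb, noun):
    """Partial translation of a verb-object pair, constituent by constituent."""
    return (translate_word(lexicons, "Verbt", verb),
            translate_word(lexicons, "Noun", noun))


def complete(lexicons, partial):
    """Resolve only the pending constituents against an extended lexicon."""
    return tuple(
        translate_word(lexicons, part.lexicon, part.word)
        if isinstance(part, Pending) else part
        for part in partial
    )


LEXICONS = {"Verbt": {"build": "Build"}, "Noun": {"house": "House"}}
EXTENDED = {**LEXICONS, "Noun": {**LEXICONS["Noun"], "computer": "Computer"}}
```

The partial result keeps the overall pair structure (feature 1), the placeholder retains the original word (feature 2), and `complete` retranslates only the pending constituent (feature 3).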

10 Conclusion 

In this work we studied some classes of languages that can be defined entirely
in terms of provability in an extension ($Ty_n^A$) of sorted type theory by
interpreting one of its base types as symbolic, whose constants denote symbols
of an alphabet $A$ or their concatenations. The symbolic type ($s$) is quite
similar to the phonological type of Higher Order Grammar (HOG) (Pollard &
Hana, 2003; Pollard, 2004; Pollard, 2006). However, the theory $Ty_n^A$
differs from the logic of HOG in that it does not contain special types for
syntactic entities.

We started from two simple observations. First, given an arbitrary consistent
set $\Delta$ of non-logical axioms and a term $L_{s\alpha t}$ of type
$s\alpha t$, one can define an $\alpha$-language, i.e. a relation between
strings of symbols from the alphabet $A$ and terms of type $\alpha$, by the
condition

$\Delta \vdash L_{s\alpha t}\, /w/\, A_\alpha,$

where $/w/$ stands for an $s$-term denoting a word (string) $w \in A^*$.
Secondly, any $Ty_n$-representable language, i.e. a language that can be
defined this way, necessarily possesses the semantic property of being
logically closed, meaning, loosely speaking, that the full semantic
interpretation of every valid expression of the language is given by a whole
class of logically equivalent terms.

Next, we proved that the converse statement is also true, that is, every
logically closed language is $Ty_n^A$-representable. This general result,
however, is not constructive in the sense that it does not provide an
effective procedure for actually building a $Ty_n$ representation for a
language defined by a finite set of rules or rule schemata. Our further
efforts therefore concentrated mainly on finding such procedures for some
important special classes of logically closed languages.

We began this with definitions of $\alpha$-lexicons and a special canonic
representation and proved that a language has a canonic representation if and
only if it is a lexicon. The canonic representation of a lexicon is
efficiently found from the description of the lexicon.

Then we defined and investigated recursive grammars, a kind of
transformational grammar equipped with $Ty_n$ semantics. We introduced $Ty_n$
operators representing the basic language-construction operations used in
the recursive grammar rules and proved that a compact $Ty_n$ representation
for every language component of a recursive grammar comes from its rules
in a quite straightforward manner. We illustrated this technique by simple
examples, from which it became clear that language components of a
recursive grammar stand for syntactic categories. We also briefly discussed
the classification of recursively defined $Ty_n$-representable languages in
terms of the Chomsky hierarchy and on this basis indicated some conditions
under which such languages turn out to be decidable, in spite of the fact that
$Ty_n$ is undecidable and hence so, in general, are $Ty_n$-representable
languages.

Then we moved to a further specialization of $Ty_n^A$ by introducing the
context type ($c$), which can be interpreted as a "store" or "state" and is
thus similar to the "state of affairs" or "World" types of (Gallin, 1975) and
(Pollard, 2005), respectively. We showed that our interpretation makes it
possible to model not only languages with intensional semantics (as in the
works referenced above), but also so-called instructive and context-dependent
languages. The latter arise in our formalism as a result of transforming an
$\alpha$-language to a state-passing style (SPS) (Wadler, 1992; Sabry, 1995;
Jones, 2003) $c(c \times \alpha)$-language, which enables passing "side
effects" of parsing some text constituents to others. An instructive language,
i.e. a $cc$-language, can be considered as a context-dependent transformation
of a unit-type language and provides semantic features that in principle could
not be modeled by a regular $t$-language nor by its intensional ($ct$)
transformation. We found $Ty_n$ representations of context-dependent language
construction operations (anaphoric and cataphoric concatenation, context
raising and instantiation) that allow the previous results to be generalized
to context-dependent languages. We demonstrated how this formalism can address
pronoun anaphora resolution.

Finally, we defined translation and expression $Ty_n^A$ operators and
introduced a special language representation (self-extension), revealing a
useful property of partial translation.

The proposed formalism efficiently addresses the known issues inherent
in the two-component language models and also partially inherited by HOG.
Indeed,

1. as there are no restrictions on the type of meanings of a
$Ty_n$-representable language or a component of it, and this type, in
particular, can be constructed with use of the symbolic type too, such a
language can generally express facts about itself or about another
$Ty_n^A$-representable language;

2. the property of partial translation of a self-extensible $Ty_n$ language
representation demonstrates its built-in lexical robustness;

3. as syntactic categories are not captured in the $Ty_n$ logic, but
introduced only in non-logical axioms defining a concrete language, the $Ty_n$
representation provides full structural robustness and flexibility.

These results have both theoretical and practical implications. First, they
facilitate modeling of such fundamental abilities associated with the human
language acquisition process as communicating new grammar rules and lexical
entries directly in an (already acquired) subset of the object language, and
independently acquiring new lexical entries upon encountering them in a known
context, with postponed acquisition of their exact meanings. Other possible
applications of the proposed formalism may include:

1. addressing donkey pronouns (Geach, 1962; Kamp, 1981; Heim, 1982;
Elbourne, 2006) and other dynamic semantic phenomena entirely within
mainstream higher-order logic semantics, as preliminarily outlined in
Section 8;

2. modeling advanced language self-learning mechanisms, for example
deriving or generalizing grammar rules from text samples (possibly
accompanied by their semantic interpretation, communicated by other means or
just expressed differently in the same language), which might be reduced, in
the frame of this formalism, to a problem of higher-order unification
(Gardent et al., 1997; Dougherty & Martinez, 2004).

Finally, an implementation of the proposed formalism, like that suggested
in (Gluzberg & Brenner, 2006), should provide a platform for building NLP
systems with the following powerful features:

1. the ability to recursively define more complex or more comprehensive
languages in terms of previously defined simpler or limited ones;

2. automatic acquisition of new lexical entries in the process of operation;

3. storage of partial text parses which can be automatically completed later,
upon extending the language definition in use;

4. free switching between different input and output languages to access
the same semantic content.